We examine the issue of bias in datasets designed to train visual question answering (VQA) algorithms. These datasets include a collection of natural language questions about images (i.e., visual questions). We consider three popular datasets whose visual questions were captured by sighted people, captured by people who are blind, or generated by computers. We first demonstrate that machine learning algorithms can be trained to recognize each dataset's bias, and thus determine the source of a novel visual question. We then discuss potential risks and benefits of biased VQA datasets and of machine learning algorithms that can identify the source of a visual question; e.g., whether it comes from a sighted person, a person who is blind, or a bot (i.e., a computer). Our ultimate aim is to inspire the development of more inclusive VQA systems.
The purpose of the SIGIR 2019 workshop on Fairness, Accountability, Confidentiality, Transparency, and Safety (FACTS-IR) was to explore challenges in responsible information retrieval system development and deployment. To this end, the workshop aimed to crowd-source from the larger SIGIR community an actionable research agenda on five key dimensions of responsible information retrieval: fairness, accountability, confidentiality, transparency, and safety. Such an agenda can guide others in the community who are interested in pursuing FACTS-IR research, as well as inform potential funders about relevant research avenues. The workshop brought together a diverse set of researchers and practitioners interested in contributing to the development of a technical research agenda for responsible information retrieval.
We present PROTOTEX, a novel white-box NLP classification architecture based on prototype networks (Li et al., 2018). PROTOTEX faithfully explains model decisions based on prototype tensors that encode latent clusters of training examples. At inference time, classification decisions are based on the distances between the input text and the prototype tensors, and are explained via the training examples most similar to the most influential prototypes. We also describe a novel interleaved training algorithm that effectively handles classes characterized by the absence of indicative features. On a propaganda detection task, PROTOTEX's accuracy matches that of BART-large and exceeds that of BERT-large, with the added benefit of providing faithful explanations. A user study also shows that prototype-based explanations help non-experts to better recognize propaganda in online news.
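The inference step described above, classifying by distance to prototype tensors and surfacing the most influential prototype, can be sketched as follows. This is a hypothetical minimal illustration, not the PROTOTEX implementation: it assumes the input text has already been encoded into a fixed-size vector, and the function name `prototype_classify` and the per-class max-similarity scoring are illustrative choices.

```python
import numpy as np

def prototype_classify(x, prototypes, proto_labels):
    """Classify x by its distances to prototype vectors.

    x:            (d,) encoded input text
    prototypes:   (p, d) learned prototype vectors
    proto_labels: (p,) class index of each prototype
    Returns the predicted class and the index of the most
    influential (nearest) prototype, which can then be explained
    via its most similar training examples.
    """
    dists = np.linalg.norm(prototypes - x, axis=1)  # distance to each prototype
    sims = -dists                                   # closer prototype => higher score
    n_classes = int(proto_labels.max()) + 1
    # Score each class by its best-matching prototype.
    scores = np.array([sims[proto_labels == c].max() for c in range(n_classes)])
    pred = int(scores.argmax())
    influential = int(dists.argmin())
    return pred, influential
```

In this sketch, the explanation attaches to `influential`: retrieving the training examples nearest to that prototype yields the kind of example-based rationale the abstract describes.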
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.