You reap what you sow: On the Challenges of Bias Evaluation Under Multilingual Settings

Evaluating bias, fairness, and social impact in monolingual language models is a difficult task. This challenge is further compounded when language modeling occurs in a multilingual context. Considering the implication of evaluation biases for large multilingual language models, we situate the discussion of bias evaluation within a wider context of social scientific research with computational work. We highlight three dimensions of developing multilingual bias evaluation frameworks: (1) increasing transparency through documentation, (2) expanding targets of bias beyond gender, and (3) addressing cultural differences that exist between languages. We further discuss the power dynamics and consequences of training large language models and recommend that researchers remain cognizant of the ramifications of developing such technologies.

A Word on Machine Ethics: A Response to Jiang et al. (2021)

Talat, Blix, Valvoda et al. 2021

Preprint

Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models

Röttger, Seelawi, Nozza et al. 2022

Hate speech detection models are typically evaluated on held-out test sets. However, this risks painting an incomplete and potentially misleading picture of model performance because of increasingly well-documented systematic gaps and biases in hate speech datasets. To enable more targeted diagnostic insights, recent research has thus introduced functional tests for hate speech detection models. However, these tests currently only exist for English-language content, which means that they cannot support the development of more effective models in other languages spoken by billions across the world. To help address this issue, we introduce MULTILINGUAL HATECHECK (MHC), a suite of functional tests for multilingual hate speech detection models. MHC covers 34 functionalities across ten languages, which is more languages than any other hate speech dataset. To illustrate MHC's utility, we train and test a high-performing multilingual hate speech detection model, and reveal critical model weaknesses for monolingual and cross-lingual applications.

On the Machine Learning of Ethical Judgments from Natural Language

Talat, Blix, Valvoda et al. 2022

Ethics is one of the longest standing intellectual endeavors of humanity. In recent years, the fields of AI and NLP have attempted to address ethical issues of harmful outcomes in machine learning systems that are made to interface with humans. One recent approach in this vein is the construction of NLP morality models that can take in arbitrary text and output a moral judgment about the situation described. In this work, we offer a critique of such NLP methods for automating ethical decision-making. Through an audit of recent work on computational approaches for predicting morality, we examine the broader issues that arise from such efforts. We conclude with a discussion of how machine ethics could usefully proceed in NLP, by focusing on current and near-future uses of technology, in a way that centers around transparency, democratic values, and allows for straightforward accountability.

Data Governance in the Age of Large-Scale Data-Driven Language Technology

Jernite, Nguyen, Biderman et al. 2022

Queer In AI: A Case Study in Community-Led Participatory AI

AI, Ovalle, Subramonian et al. 2023

Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources

McMillan-Major, Alyafeai, Biderman et al. 2022

Preprint

In recent years, large-scale data collection efforts have prioritized the amount of data collected in order to improve the modeling capabilities of large language models. This prioritization, however, has resulted in concerns with respect to the rights of data subjects represented in data collections, particularly when considering the difficulty in interrogating these collections due to insufficient documentation and tools for analysis. Mindful of these pitfalls, we present our methodology for a documentation-first, human-centered data collection project as part of the BigScience initiative. We identified a geographically diverse set of target language groups (Arabic, Basque, Chinese, Catalan, English, French, Indic languages, Indonesian, Niger-Congo languages, Portuguese, Spanish, and Vietnamese, as well as programming languages) for which to collect metadata on potential data sources. To structure this effort, we developed our online catalogue as a supporting tool for gathering metadata through organized public hackathons. We present our development process; analyses of the resulting resource metadata, including distributions over languages, regions, and resource types; and our lessons learned in this endeavor.

Zeerak Talat

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

You reap what you sow: On the Challenges of Bias Evaluation Under Multilingual Settings
