Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022
DOI: 10.18653/v1/2022.acl-long.301

FairLex: A Multilingual Benchmark for Evaluating Fairness in Legal Text Processing

Abstract: We present a benchmark suite of four datasets for evaluating the fairness of pre-trained language models and the techniques used to fine-tune them for downstream tasks. Our benchmarks cover four jurisdictions (European Council, USA, Switzerland, and China), five languages (English, German, French, Italian and Chinese) and fairness across five attributes (gender, age, region, language, and legal area). In our experiments, we evaluate pre-trained language models using several group-robust fine-tuning techniques an…
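As context for how fairness across such protected attributes is typically quantified, the sketch below scores predictions separately for each group and reports the worst-group score. It is a minimal, assumption-laden illustration (the function name evaluate_by_group and the toy data are hypothetical), not the benchmark's released evaluation code.

```python
# Minimal sketch of group-wise fairness evaluation (hypothetical names,
# not the FairLex release code): per-group macro-F1 plus the worst-group score.
from collections import defaultdict
from sklearn.metrics import f1_score

def evaluate_by_group(y_true, y_pred, group_labels):
    """Compute macro-F1 per protected-attribute group and the minimum across groups."""
    by_group = defaultdict(lambda: ([], []))
    for gold, pred, group in zip(y_true, y_pred, group_labels):
        by_group[group][0].append(gold)
        by_group[group][1].append(pred)
    scores = {g: f1_score(t, p, average="macro") for g, (t, p) in by_group.items()}
    return scores, min(scores.values())

# Toy example: groups could stand in for gender, age bracket, region, etc.
scores, worst_group = evaluate_by_group(
    y_true=[1, 0, 1, 1, 0, 1],
    y_pred=[1, 0, 0, 1, 0, 0],
    group_labels=["A", "A", "A", "B", "B", "B"],
)
print(scores, worst_group)
```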

Cited by 12 publications (20 citation statements); references 19 publications (25 reference statements).
“…A similar effect is also observed by Baldini et al (2022). Chalkidis et al (2022) examine the effectiveness of debiasing methods over a multi-lingual benchmark dataset consisting of four subsets of legal documents, covering five languages and various sensitive attributes. They find that methods aiming to improve worst-case performance tend to fail in more realistic settings, where both target label and protected attribute distributions vary over time.…”
Section: Effectiveness of Debiasing Methods (supporting)
confidence: 62%
“…Beyond the standard definitions of fairness, a number of studies have examined the effectiveness of various debiasing methods in additional settings (Gonen and Goldberg, 2019; Meade et al, 2021; Lamba et al, 2021; Baldini et al, 2022; Chalkidis et al, 2022). For example, Meade et al (2021) not only examine the effectiveness of various debiasing methods but also measure the impact of debiasing methods on a model's language modeling ability and downstream task performance.…”
Section: Effectiveness of Debiasing Methods (mentioning)
confidence: 99%
“…AI/NLP researchers have uniformly adopted a Rawlsian notion of fairness. This is reflected in the by now common practice of citing Rawls when mentioning fairness [3][4][5][6][7]. Fairness plays a central role in the philosophy of John Rawls.…”
Section: AI/NLP Fairness Is Rawlsian (mentioning)
confidence: 99%
“…Welfare, like 'benefit', is performance as measured by the go-to performance metric. Rawlsian fairness thus becomes maximizing the performance on data sampled from the group on which performance is currently lowest. Many algorithms have therefore been developed to maximize performance on the groups with the worst performance.…”
Section: AI/NLP Fairness Is Rawlsian (mentioning)
confidence: 99%
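The worst-group objective described in this statement can be made concrete with a short loss sketch in the spirit of group distributionally robust optimization: compute the loss separately per protected group in a batch and optimize the highest one. This is a hypothetical minimal PyTorch example, not code from the cited papers or from FairLex.

```python
# Minimal sketch (hypothetical, not from the cited papers): a worst-group
# loss in the spirit of group distributionally robust optimization.
import torch
import torch.nn.functional as F

def worst_group_loss(logits, labels, groups):
    """Return the highest per-group mean cross-entropy in the batch."""
    per_example = F.cross_entropy(logits, labels, reduction="none")
    group_losses = [per_example[groups == g].mean() for g in torch.unique(groups)]
    # Minimizing this trains against the currently worst-off group,
    # i.e. it pushes up performance where performance is lowest.
    return torch.stack(group_losses).max()

# Toy usage with random tensors standing in for model outputs.
logits = torch.randn(8, 3, requires_grad=True)   # batch of 8 examples, 3 classes
labels = torch.randint(0, 3, (8,))               # gold labels
groups = torch.tensor([0, 0, 0, 1, 1, 1, 2, 2])  # protected-attribute group ids
loss = worst_group_loss(logits, labels, groups)
loss.backward()
print(float(loss))
```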