Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020)
DOI: 10.18653/v1/2020.emnlp-main.98
SetConv: A New Approach for Learning from Imbalanced Data

Abstract: For many real-world classification problems, e.g., sentiment classification, most existing machine learning methods are biased towards the majority class when the Imbalance Ratio (IR) is high. To address this problem, we propose a set convolution (SetConv) operation and an episodic training strategy to extract a single representative for each class, so that classifiers can later be trained on a balanced class distribution. We prove that our proposed algorithm is permutation-invariant despite the order of input…
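The abstract's key property — a set-to-representative mapping that is permutation-invariant — can be illustrated with a minimal sketch. This is not the authors' SetConv implementation (the paper's learned convolution kernel is not shown here); it only demonstrates the general principle that aggregating a set by a symmetric operation (a weighted mean) makes the output independent of element order. The `weights` parameter is a hypothetical stand-in for what a learned kernel would produce.

```python
# Illustrative sketch (NOT the paper's SetConv): a permutation-invariant
# aggregator mapping a variable-size set of feature vectors to a single
# class representative. Invariance follows because summation over set
# elements is order-independent.
import numpy as np

def set_representative(features, weights=None):
    """Aggregate one class's feature vectors into a single representative.

    features: (n_samples, dim) array, one row per instance of the class.
    weights:  optional per-instance weights (hypothetical; a learned
              kernel would play this role in a real set-convolution layer).
    """
    features = np.asarray(features, dtype=float)
    if weights is None:
        weights = np.ones(len(features))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()   # normalize to a convex combination
    return weights @ features           # weighted mean over the set

# Permutation invariance: shuffling the rows leaves the output unchanged.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
perm = rng.permutation(5)
assert np.allclose(set_representative(x), set_representative(x[perm]))
```

Because each class — however many instances it has — is reduced to one representative, a downstream classifier effectively sees a balanced class distribution, which is the motivation stated in the abstract.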

Cited by 9 publications (6 citation statements) · References 26 publications
“…2) MC over-predicts majority classes in both datasets (2s and 3s for MR and 1s and 10s for IMDb) while under-predicting the others (except 2s and 3s in IMDb). These results are in line with the common observation that MC models tend to overfit on the majority classes in imbalanced datasets, which motivates the use of "oversampling" or class balancing (Buda et al., 2018; Chawla et al., 2002; Tepper et al., 2020; Gao et al., 2020). OR, in contrast, provides a better fit for MR (slightly under-predicting for 1s), but significantly under-predicts on IMDb majority classes, displaying a much flatter distribution of predictions.…”

Section: Dataset Benchmarks (supporting)
confidence: 83%
“…More concrete definitions, e.g., regarding the relative share up to which a class is seen as a minority class, depend on the task, dataset and labelset size. Much research focuses on improving all minority classes equally while maintaining or at least monitoring majority class performance (e.g., Huang et al., 2021; Yang et al., 2020; Spangher et al., 2021). We next discuss prototypical types of imbalance (Sec.…”

Section: Problem Definition (mentioning)
confidence: 99%
“…Under imbalance, two issues arise. First, although class-specific weights have been used with BCE (e.g., Yang et al., 2020), their effect on minority classes is less clear than in the single-label case. For each instance, all classes contribute to BCE, with the labels not assigned to the instance (called negative classes) included via (1 − y_j) log(1 − p_j).…”

Section: Loss Functions (mentioning)
confidence: 99%
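The point in the quote above — that under binary cross-entropy every class contributes to each instance's loss, with negative classes entering through the (1 − y_j) log(1 − p_j) term — can be made concrete with a small sketch. The `class_weights` parameter here is illustrative (weighting schemes vary across the cited papers), not a specific proposal from any of them.

```python
# Sketch of per-class binary cross-entropy for one multi-label instance.
# Negative classes (y_j = 0) still contribute loss whenever p_j > 0,
# via the (1 - y_j) * log(1 - p_j) term.
import math

def bce_per_class(y, p, class_weights=None):
    """Per-class BCE terms for one instance.

    y: 0/1 label per class, p: predicted probability per class,
    class_weights: optional per-class weights (illustrative only).
    """
    if class_weights is None:
        class_weights = [1.0] * len(y)
    return [-w * (yj * math.log(pj) + (1 - yj) * math.log(1 - pj))
            for yj, pj, w in zip(y, p, class_weights)]

# One positive class and two negative classes: all three terms are nonzero,
# so up-weighting a minority class also interacts with its negative terms.
losses = bce_per_class([1, 0, 0], [0.9, 0.2, 0.05])
```

This is why, as the quote notes, class-specific weights have a less transparent effect on minority classes in the multi-label BCE setting than in the single-label case: a weight scales both a class's positive and negative contributions.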