2021
DOI: 10.1609/aaai.v35i15.17601
MASKER: Masked Keyword Regularization for Reliable Text Classification

Abstract: Pre-trained language models have achieved state-of-the-art accuracies on various text classification tasks, e.g., sentiment analysis, natural language inference, and semantic textual similarity. However, the reliability of the fine-tuned text classifiers is an often overlooked performance criterion. For instance, one may desire a model that can detect out-of-distribution (OOD) samples (drawn far from the training distribution) or be robust against domain shifts. We claim that one central obstacle to the reliability…
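As context for the abstract, the sketch below illustrates one way masked keyword regularization could look in code: mask the most-attended tokens, then penalize confident predictions made from the keywords alone (the paper also reconstructs masked keywords from context). The top-k attention heuristic, function names, and tensor shapes here are illustrative assumptions, not the paper's exact procedure.

```python
# Illustrative sketch only: masks presumed "keyword" tokens (here, the
# top-k most-attended ones) and defines an entropy regularizer that
# discourages confident predictions from keyword-only inputs.
import torch
import torch.nn.functional as F

def mask_keywords(input_ids, attn_received, mask_token_id, top_k=3):
    """Replace the top-k most-attended tokens with the mask token.

    input_ids:     (batch, seq_len) token ids.
    attn_received: (batch, seq_len) attention mass each token receives.
    """
    masked = input_ids.clone()
    topk_idx = attn_received.topk(top_k, dim=-1).indices  # (batch, top_k)
    masked.scatter_(1, topk_idx, mask_token_id)
    return masked

def keyword_entropy_penalty(keyword_only_logits):
    """Negative entropy of class predictions; minimizing this pushes the
    model toward low confidence when it sees only keywords."""
    log_p = F.log_softmax(keyword_only_logits, dim=-1)
    return (log_p.exp() * log_p).sum(dim=-1).mean()
```

In a full training loop, the total loss would presumably combine the usual cross-entropy on the original input, a reconstruction loss on the masked positions, and this penalty on keyword-only inputs.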

Citations: Cited by 17 publications (14 citation statements)
References: 32 publications
“…Here, we note that the definition of bias is more general, and many biases cannot be represented as majority bias. For example, data can contain multiple attributes, some of which dominate the prediction, e.g., texture bias (Geirhos et al., 2019) for images or keyword bias (Moon et al., 2021) for texts. This bias appears in every sample and does not belong to a specific subgroup of the dataset.…”
Section: F Additional Discussion On Bias Types
confidence: 99%
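A tiny, made-up example of the keyword bias this statement refers to: a single cue word correlates perfectly with the label, so it dominates prediction in every sample rather than in a separable subgroup.

```python
# Hypothetical toy data: "amazing" alone predicts the label, so a
# bag-of-words classifier can ignore all remaining context.
train = [
    ("the plot was amazing and moving", 1),
    ("an amazing, layered performance", 1),
    ("flat characters and a dull script", 0),
    ("the pacing drags throughout", 0),
]
cue_rule = lambda text: int("amazing" in text)
assert all(cue_rule(t) == y for t, y in train)  # 100% on this data,
# yet the rule fails on e.g. "not amazing at all" -- a keyword shortcut.
```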
“…Dan and Roth (2021) conduct an empirical study of the effects of model capacity on PLMs and show that smaller pre-trained transformers provide more reliable predictions. Moon et al. (2020) find that PLMs tend to produce over-confident outputs based on in-distribution (ID) keywords rather than contextual relations between words. They demonstrate that keyword-biased predictions can be over-confident even on out-of-distribution samples that contain ID keywords.…”
Section: Related Work
confidence: 99%
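To make the over-confidence claim concrete, a standard check is the maximum softmax probability (MSP) score: under a keyword-biased model, OOD inputs containing ID keywords would receive high MSP. The classifier interface below is a hypothetical placeholder, not an API from the cited work.

```python
# Hedged sketch: maximum softmax probability as a confidence/OOD score.
import torch
import torch.nn.functional as F

@torch.no_grad()
def msp_score(classifier, input_ids):
    """classifier(input_ids) -> (batch, num_classes) logits (assumed API)."""
    logits = classifier(input_ids)
    return F.softmax(logits, dim=-1).max(dim=-1).values
# A reliable model should yield low MSP on OOD text even when that
# text contains in-distribution keywords.
```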
“…Therefore, interpretable methods are adopted to facilitate automatic identification of robust/non-robust regions at scale, e.g., attention scores [23], mutual information [10], and integrated gradients [24], [25]. Besides, counterfactual causal inference is also used to determine the importance of a token by adding perturbations to it [25], [26].…”
Section: Shortcuts and Causal Features
confidence: 99%
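As a sketch of the attention-score heuristic mentioned above (one of several identification methods cited), the snippet ranks tokens by the attention mass they receive; averaging over heads and then over query positions is one common aggregation, assumed here for illustration.

```python
# Hedged sketch: flag candidate shortcut tokens by received attention.
import torch

def top_attended_tokens(attn, tokens, k=5):
    """attn:   (num_heads, seq_len, seq_len) self-attention weights from
               one layer; rows are queries, columns are keys.
    tokens: list of seq_len token strings."""
    received = attn.mean(dim=0).mean(dim=0)   # average heads, then queries
    idx = received.topk(min(k, len(tokens))).indices
    return [tokens[i] for i in idx.tolist()]
```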
“…Multiple approaches have been studied for shortcut mitigation and robust model learning, such as domain adaptation [27] and multi-task learning [28]. Given known shortcuts or causal features, it is easy to guide the model correctly via adversarial training [29], reweighting [30], Product-of-Experts [31], knowledge distillation [32], keyword regularization [23], and contrastive learning [25]. Recently, researchers have developed counterfactual data augmentation methods to build robust classifiers, achieving state-of-the-art results [13].…”
Section: Shortcut Mitigation and Robust Model Learning
confidence: 99%
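Of the mitigation techniques listed, Product-of-Experts has a particularly compact form. The sketch below follows the common formulation (combine log-probabilities of the main model and a frozen bias-only model, then apply cross-entropy); the function and argument names are chosen for illustration.

```python
# Hedged sketch of Product-of-Experts debiasing.
import torch.nn.functional as F

def poe_loss(main_logits, bias_logits, labels):
    """Train the main model through its product with a frozen
    keyword-only "biased" expert, so the main model need not
    re-learn what the shortcut already explains."""
    combined = F.log_softmax(main_logits, dim=-1) \
             + F.log_softmax(bias_logits, dim=-1).detach()
    return F.cross_entropy(combined, labels)  # renormalizes internally
```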