2021
DOI: 10.1609/aaai.v35i15.17601
MASKER: Masked Keyword Regularization for Reliable Text Classification

Abstract: Pre-trained language models have achieved state-of-the-art accuracies on various text classification tasks, e.g., sentiment analysis, natural language inference, and semantic textual similarity. However, the reliability of the fine-tuned text classifiers is an often overlooked performance criterion. For instance, one may desire a model that can detect out-of-distribution (OOD) samples (drawn far from the training distribution) or be robust against domain shifts. We claim that one central obstacle to the reliability…
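As context for the abstract, the sketch below illustrates one way masked keyword regularization could look in code: mask the most-attended tokens, then penalize confident predictions made from the keywords alone (the paper also reconstructs masked keywords from context). The top-k attention heuristic, function names, and tensor shapes here are illustrative assumptions, not the paper's exact procedure.

```python
# Illustrative sketch only: masks presumed "keyword" tokens (here, the
# top-k most-attended ones) and defines an entropy regularizer that
# discourages confident predictions from keyword-only inputs.
import torch
import torch.nn.functional as F

def mask_keywords(input_ids, attn_received, mask_token_id, top_k=3):
    """Replace the top-k most-attended tokens with the mask token.

    input_ids:     (batch, seq_len) token ids.
    attn_received: (batch, seq_len) attention mass each token receives.
    """
    masked = input_ids.clone()
    topk_idx = attn_received.topk(top_k, dim=-1).indices  # (batch, top_k)
    masked.scatter_(1, topk_idx, mask_token_id)
    return masked

def keyword_entropy_penalty(keyword_only_logits):
    """Negative entropy of class predictions; minimizing this pushes the
    model toward low confidence when it sees only keywords."""
    log_p = F.log_softmax(keyword_only_logits, dim=-1)
    return (log_p.exp() * log_p).sum(dim=-1).mean()
```

In a full training loop, the total loss would presumably combine the usual cross-entropy on the original input, a reconstruction loss on the masked positions, and this penalty on keyword-only inputs.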

Citations: Cited by 17 publications (14 citation statements)
References: 32 publications
“…Here, we note that the definition of bias is more general, and many biases cannot be represented as majority bias. For example, data can contain multiple attributes, some of which dominate the prediction, e.g., texture bias (Geirhos et al., 2019) for images or keyword bias (Moon et al., 2021) for texts. This bias appears in every sample and does not belong to a specific subgroup of the dataset.…”
Section: F Additional Discussion On Bias Types
confidence: 99%
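A tiny, made-up example of the keyword bias this statement refers to: a single cue word correlates perfectly with the label, so it dominates prediction in every sample rather than in a separable subgroup.

```python
# Hypothetical toy data: "amazing" alone predicts the label, so a
# bag-of-words classifier can ignore all remaining context.
train = [
    ("the plot was amazing and moving", 1),
    ("an amazing, layered performance", 1),
    ("flat characters and a dull script", 0),
    ("the pacing drags throughout", 0),
]
cue_rule = lambda text: int("amazing" in text)
assert all(cue_rule(t) == y for t, y in train)  # 100% on this data,
# yet the rule fails on e.g. "not amazing at all" -- a keyword shortcut.
```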
“…Dan and Roth (2021) conduct an empirical study of the effects of model capacity on PLMs and show that smaller pre-trained transformers provide more reliable predictions. Moon et al. (2020) find that PLMs tend to produce over-confident outputs based on in-distribution (ID) keywords rather than contextual relations between words. They demonstrate that keyword-biased predictions can be over-confident even on out-of-distribution samples that contain ID keywords.…”
Section: Related Work
confidence: 99%
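To make the over-confidence claim concrete, a standard check is the maximum softmax probability (MSP) score: under a keyword-biased model, OOD inputs containing ID keywords would receive high MSP. The classifier interface below is a hypothetical placeholder, not an API from the cited work.

```python
# Hedged sketch: maximum softmax probability as a confidence/OOD score.
import torch
import torch.nn.functional as F

@torch.no_grad()
def msp_score(classifier, input_ids):
    """classifier(input_ids) -> (batch, num_classes) logits (assumed API)."""
    logits = classifier(input_ids)
    return F.softmax(logits, dim=-1).max(dim=-1).values
# A reliable model should yield low MSP on OOD text even when that
# text contains in-distribution keywords.
```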
“…Therefore, interpretable methods are adopted to facilitate automatic identification of robust/non-robust regions at scale, e.g., attention scores [23], mutual information [10], and integrated gradients [24], [25]. Besides, counterfactual causal inference is also used to determine the importance of a token by adding perturbations to it [25], [26].…”
Section: Shortcuts and Causal Features
confidence: 99%
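As a sketch of the attention-score heuristic mentioned above (one of several identification methods cited), the snippet ranks tokens by the attention mass they receive; averaging over heads and then over query positions is one common aggregation, assumed here for illustration.

```python
# Hedged sketch: flag candidate shortcut tokens by received attention.
import torch

def top_attended_tokens(attn, tokens, k=5):
    """attn:   (num_heads, seq_len, seq_len) self-attention weights from
               one layer; rows are queries, columns are keys.
    tokens: list of seq_len token strings."""
    received = attn.mean(dim=0).mean(dim=0)   # average heads, then queries
    idx = received.topk(min(k, len(tokens))).indices
    return [tokens[i] for i in idx.tolist()]
```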
“…Multiple approaches have been studied for shortcut mitigation and robust model learning, such as domain adaptation [27] and multi-task learning [28]. Given known shortcuts or causal features, it is easy to guide the model correctly via adversarial training [29], reweighting [30], Product-of-Experts [31], knowledge distillation [32], keyword regularization [23], and contrastive learning [25]. Recently, researchers have developed counterfactual data augmentation methods to build robust classifiers, achieving state-of-the-art results [13].…”
Section: Shortcut Mitigation and Robust Model Learning
confidence: 99%
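Of the mitigation techniques listed, Product-of-Experts has a particularly compact form. The sketch below follows the common formulation (combine log-probabilities of the main model and a frozen bias-only model, then apply cross-entropy); the function and argument names are chosen for illustration.

```python
# Hedged sketch of Product-of-Experts debiasing.
import torch.nn.functional as F

def poe_loss(main_logits, bias_logits, labels):
    """Train the main model through its product with a frozen
    keyword-only "biased" expert, so the main model need not
    re-learn what the shortcut already explains."""
    combined = F.log_softmax(main_logits, dim=-1) \
             + F.log_softmax(bias_logits, dim=-1).detach()
    return F.cross_entropy(combined, labels)  # renormalizes internally
```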