2023
DOI: 10.48550/arxiv.2301.12715
Preprint

Fine-Tuning Deteriorates General Textual Out-of-Distribution Detection by Distorting Task-Agnostic Features

Abstract: Detecting out-of-distribution (OOD) inputs is crucial for the safe deployment of natural language processing (NLP) models. Though existing methods, especially those based on the statistics in the feature space of fine-tuned pretrained language models (PLMs), are claimed to be effective, their effectiveness on different types of distribution shifts remains underexplored. In this work, we take the first step to comprehensively evaluate the mainstream textual OOD detection methods for detecting semantic and non-semantic…
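The feature-space methods the abstract refers to typically fit Gaussian statistics to in-distribution (IND) features from the fine-tuned PLM and score new inputs by Mahalanobis distance. Below is a minimal sketch of that idea, assuming features (e.g., [CLS] embeddings) have already been extracted; the function names and the regularization constant are illustrative, not the paper's code.

```python
import numpy as np

def fit_ind_stats(train_features: np.ndarray):
    """Estimate mean and (regularized) precision of IND features.

    train_features: (n_samples, dim) array of PLM sentence embeddings.
    """
    mu = train_features.mean(axis=0)
    centered = train_features - mu
    cov = centered.T @ centered / len(train_features)
    # Small ridge term keeps the covariance invertible in high dimensions.
    precision = np.linalg.inv(cov + 1e-6 * np.eye(cov.shape[0]))
    return mu, precision

def mahalanobis_ood_score(x: np.ndarray, mu: np.ndarray,
                          precision: np.ndarray) -> float:
    """Squared Mahalanobis distance to the IND statistics; larger = more OOD-like."""
    d = x - mu
    return float(d @ precision @ d)
```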

Cited by 1 publication (2 citation statements) | References 25 publications (62 reference statements)
“…We compare the proposed FLatS with 9 popular OOD detection methods. (1) For confidence-based methods, which leverage the output probabilities of classifiers trained on IND data to detect OOD samples, we evaluate MSP, energy score (Liu et al., 2020), ODIN (Liang et al., 2017), D2U (Chen et al., 2023), and MLS (Hendrycks et al., 2019); (2) for distance-based methods, we test LOF (Breunig et al., 2000), Maha, KNN, and GNOME (Chen et al., 2023).…”
Section: Datasets and Baselines (mentioning)
Confidence: 99%
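For reference, the three logit-based scores named in this statement (MSP, energy, MLS) are each a one-liner over a classifier's logits. A minimal sketch under the standard conventions from the cited papers; the function names are mine, not from FLatS.

```python
import numpy as np
from scipy.special import logsumexp, softmax

def msp_score(logits: np.ndarray) -> float:
    """Maximum softmax probability; lower values flag OOD."""
    return float(softmax(logits).max())

def energy_score(logits: np.ndarray, T: float = 1.0) -> float:
    """Free energy E(x) = -T * logsumexp(logits / T) (Liu et al., 2020);
    higher energy flags OOD."""
    return float(-T * logsumexp(logits / T))

def mls_score(logits: np.ndarray) -> float:
    """Maximum logit score (Hendrycks et al., 2019); lower values flag OOD."""
    return float(logits.max())
```

Since the scores disagree on sign convention (low MSP/MLS vs. high energy signals OOD), benchmarks typically fix a direction per score and compare detectors via threshold-free metrics such as AUROC.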
“…Hyperparameters. We use k = 10 for KNN, following Chen et al. (2023). Searching over {0.1, 0.2, 0.5, 1.0, 2.0}, we adopt α = 0.5 for Equation (5).…”
Section: Implementation Details (mentioning)
Confidence: 99%
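This quote pins down two hyperparameters: k = 10 for the KNN detector and α = 0.5 for the citing paper's Equation (5), which is not reproduced on this page. The sketch below shows a standard KNN OOD score on L2-normalized features, plus a generic weighted combination standing in for Equation (5); both function names and the combination form are illustrative assumptions, not the paper's formula.

```python
import numpy as np

def knn_ood_score(x: np.ndarray, ind_bank: np.ndarray, k: int = 10) -> float:
    """Distance to the k-th nearest IND feature; larger = more OOD-like.

    Features are L2-normalized first, as is common for KNN-based OOD detection.
    """
    x = x / np.linalg.norm(x)
    bank = ind_bank / np.linalg.norm(ind_bank, axis=1, keepdims=True)
    dists = np.linalg.norm(bank - x, axis=1)
    return float(np.sort(dists)[k - 1])

def combined_score(base: float, aux: float, alpha: float = 0.5) -> float:
    # Placeholder weighted sum: Equation (5)'s actual form is not shown in
    # the quoted statement; this only illustrates how an alpha searched over
    # {0.1, 0.2, 0.5, 1.0, 2.0} might enter a combined detection score.
    return base + alpha * aux
```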