Proceedings of the Conference Recent Advances in Natural Language Processing - Deep Learning for Natural Language Processing Me 2021
DOI: 10.26615/978-954-452-072-4_148

Towards Domain-Generalizable Paraphrase Identification by Avoiding the Shortcut Learning

Abstract: In this paper, we investigate the Domain Generalization (DG) problem for supervised Paraphrase Identification (PI). We observe that the performance of existing PI models deteriorates dramatically when tested in an out-of-distribution (OOD) domain. We conjecture that this is caused by shortcut learning, i.e., these models tend to utilize cue words that are unique to a particular dataset or domain. To alleviate this issue and enhance the DG ability, we propose a PI framework based on Optimal Transport (OT). Our…

Cited by 2 publications (1 citation statement)
References: 15 publications
“…[7] proposed a measurement for quantifying the shortcut degree, with which a shortcut mitigation framework was introduced for natural language understanding (NLU). [47] forces the network to learn the necessary features for all the words in the input to alleviate the shortcut learning problem in supervised Paraphrase Identification (PI). In the medical imaging field, prior works also suggested the existence of shortcuts and proposed the strategies to neutralise shortcut learning such as removing the bias in the training dataset [26,35,41].…”
Section: Shortcut Learning (mentioning)
confidence: 99%