2024
DOI: 10.1109/access.2024.3360306

Shortcut Learning Explanations for Deep Natural Language Processing: A Survey on Dataset Biases

Varun Dogra, Sahil Verma, Kavita, et al.

Abstract: The introduction of pre-trained large language models (LLMs) has transformed NLP through fine-tuning on task-specific datasets, enabling notable advancements in news classification, language translation, and sentiment analysis. This has revolutionized the field, driving remarkable breakthroughs and progress. However, the growing recognition of bias in textual data has emerged as a critical focus in the NLP community, revealing the inherent limitations of models trained on specific datasets. LLMs exploit these dataset b…

Cited by 4 publications (1 citation statement)
References 68 publications (74 reference statements)
“…Changing the dataset utilized by the recommendation system to, for example, a company's dataset results in recommended items being focused on previous ideas, patents, or products. Dataset biases must be addressed as this is a common challenge in natural language processing [60]. During our work, we utilized human validation to identify various issues, including biases in the data.…”
Section: Discussion (citation type: mentioning)
confidence: 99%