2020
DOI: 10.48550/arxiv.2005.05683
Preprint

On the Robustness of Language Encoders against Grammatical Errors

Abstract: We conduct a thorough study to diagnose the behaviors of pre-trained language encoders (ELMo, BERT, and RoBERTa) when confronted with natural grammatical errors. Specifically, we collect real grammatical errors from non-native speakers and conduct adversarial attacks to simulate these errors on clean text data. We use this approach to facilitate debugging models on downstream applications. Results confirm that the performance of all tested models is affected, but the degree of impact varies. To interpret model …
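
The attack procedure is only summarized in the abstract. As a minimal, hypothetical sketch of how grammatical errors could be injected into clean text, assuming simple article-drop and agreement-swap rules that are illustrative only (the paper derives its error patterns from real learner data, which is not reproduced here):

```python
import random

# Hypothetical error-injection rules, for illustration only; the paper's
# actual attack is based on errors collected from non-native speakers.
ARTICLE_DROP = {"a", "an", "the"}                    # article-omission errors
AGREEMENT_SWAPS = {"is": "are", "are": "is",
                   "was": "were", "were": "was"}     # subject-verb agreement errors

def perturb(sentence: str, p: float = 0.15, seed: int = 0) -> str:
    """Inject simulated grammatical errors into a clean sentence.

    Each eligible token is perturbed independently with probability p.
    """
    rng = random.Random(seed)
    out = []
    for tok in sentence.split():
        lower = tok.lower()
        if rng.random() < p:
            if lower in ARTICLE_DROP:
                continue                              # drop the article entirely
            if lower in AGREEMENT_SWAPS:
                out.append(AGREEMENT_SWAPS[lower])    # swap the verb form
                continue
        out.append(tok)
    return " ".join(out)

print(perturb("The results are clear and the model is robust .", p=1.0))
# -> "results is clear and model are robust ."
```

A perturbed copy of a downstream dataset built this way can then be fed to each encoder to measure how far task performance drops relative to the clean text.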

Cited by 1 publication (1 citation statement)
References 27 publications

“…Transformer [35] architecture has achieved remarkable performance on many important Natural Language Processing (NLP) tasks, so the robustness of Transformers has been studied on those tasks. Several works [17, 19, 29, 22, 12, 43] conducted adversarial attacks on Transformers, including pre-trained models; in their experiments, Transformers usually show better robustness than models built on LSTMs or CNNs, with a theoretical explanation provided in [17]. However, due to the discrete nature of NLP models, these studies focus on discrete perturbations (e.g., word or character substitutions), which are very different from the small, continuous perturbations in computer vision tasks.…”
Section: Related Work
confidence: 99%
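
To make the excerpt's contrast concrete: a discrete NLP perturbation edits tokens outright, while a vision-style perturbation nudges a continuous vector by a small bounded amount. The sketch below is a toy illustration; the synonym table, the epsilon bound, and the zero embedding are assumptions for demonstration, not any cited attack.

```python
import numpy as np

# Toy synonym table; a real word-substitution attack would draw candidates
# from a thesaurus or a masked language model (an assumption here).
SYNONYMS = {"good": "great", "movie": "film"}

def discrete_perturb(tokens):
    """Discrete perturbation: swap whole words, as in NLP attacks."""
    return [SYNONYMS.get(t, t) for t in tokens]

def continuous_perturb(embedding, eps=0.01, seed=0):
    """Continuous perturbation: add small bounded noise to a vector,
    as in computer-vision adversarial examples."""
    rng = np.random.default_rng(seed)
    noise = rng.uniform(-eps, eps, size=embedding.shape)
    return embedding + noise

tokens = ["a", "good", "movie"]
print(discrete_perturb(tokens))    # ['a', 'great', 'film']

emb = np.zeros(4)                  # stand-in for a word embedding
print(continuous_perturb(emb))     # four values in (-0.01, 0.01)
```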