“…The vulnerability of modern neural networks to human-imperceptible input variations has been studied since Szegedy et al. (2013), primarily in the computer vision community (e.g., Goodfellow et al., 2015), and later extended to the NLP community (e.g., Ebrahimi et al., 2017; Liang et al., 2017; Yin et al., 2020; Jones et al., 2020; Jia et al., 2019; Liu et al., 2019; Pruthi et al., 2019). Recent studies suggest that this fragility is rooted in the fact that the data contains multiple signals capable of reducing the empirical risk: when a model is forced to reduce its training error, it picks up whatever information diminishes the empirical loss, regardless of whether the learned knowledge aligns with human perception (Wang et al., 2019b). This connects the adversarial robustness problem to the problem of bias in data, which has also been studied for some time (e.g., Wang et al., 2016; Goyal et al., 2017; Kaushik and Lipton, 2018; Wang et al., 2019a).…”
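The "multiple signals" argument can be made concrete with a toy experiment. The sketch below (illustrative only, not drawn from any of the cited papers; all variable names and the data-generating process are assumptions) trains a plain empirical-risk minimizer on data where a genuine feature and a spurious, dataset-bias feature both predict the label. The model latches onto the spurious signal because it reduces the training loss more, and accuracy collapses once that correlation breaks at test time:

```python
# Minimal sketch: an ERM-trained model prefers whatever signal best
# reduces training loss, even if that signal is a data-collection artifact.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
y = rng.integers(0, 2, n)

# "True" signal: noisy but genuinely tied to the label.
true_feat = y + rng.normal(0, 1.0, n)
# Spurious signal: nearly noise-free at train time; its correlation
# with the label is an artifact of how the data was collected.
spurious_feat = y + rng.normal(0, 0.1, n)

X_train = np.column_stack([true_feat, spurious_feat])
clf = LogisticRegression().fit(X_train, y)
print("learned weights:", clf.coef_)  # spurious feature dominates

# At test time the spurious correlation no longer holds.
y_test = rng.integers(0, 2, n)
X_test = np.column_stack([
    y_test + rng.normal(0, 1.0, n),  # true signal persists
    rng.normal(0, 0.1, n),           # spurious feature is now pure noise
])
print("train accuracy:", clf.score(X_train, y))
print("test accuracy:", clf.score(X_test, y_test))  # far below train accuracy
```

Nothing in the training objective distinguishes the two features, which is the point of the quoted passage: the same mechanism that makes models exploit dataset bias also leaves them exposed to imperceptible adversarial perturbations of the features they over-rely on.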