Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
DOI: 10.18653/v1/2021.findings-acl.84

A Survey of Data Augmentation Approaches for NLP

Abstract: Data augmentation has recently seen increased interest in NLP due to more work in low-resource domains, new tasks, and the popularity of large-scale neural networks that require large amounts of training data. Despite this recent upsurge, this area is still relatively underexplored, perhaps due to the challenges posed by the discrete nature of language data. In this paper, we present a comprehensive and unifying survey of data augmentation for NLP by summarizing the literature in a structured manner. We first i…

Cited by 257 publications (84 citation statements)
References 124 publications (83 reference statements)
“…Common image augmentation approaches include copying and warping an image, i.e via cropping and rotation 8 . NLP augmentation techniques may include copying a sentence and substituting words with synonyms to preserve meaning or translating a sentence into another language and back again [18][19][20] .…”
Section: Introduction (mentioning)
confidence: 99%
“…Common image augmentation approaches include copying and warping an image, i.e via cropping and rotation 8 . NLP augmentation techniques may include copying a sentence and substituting words with synonyms to preserve meaning or translating a sentence into another language and back again 18–20 . Additionally synthetic data can be generated through a variety of techniques including Generative Adversarial Networks (GANs) and the Synthetic Minority Oversampling Technique (SMOTE) 8,21 .…”
Section: Introduction (mentioning)
confidence: 99%
“…In other domains, researchers have developed various data augmentation techniques to overcome this bottleneck, enhancing the generalization of deep networks given limited data. For example, data augmentation has been used in computer vision [2], natural language processing [3], and semi-supervised learning [4].…”
Section: Introduction (mentioning)
confidence: 99%
“…Large numbers of DA methods have been proposed recently, and a survey of existing methods is beneficial so that researchers could keep up with the speed of innovation. Liu et al [2] and Feng et al [3] both present surveys that give a bird's eye view of DA for NLP. They directly divide the categories according to the methods.…”
Section: Introduction (mentioning)
confidence: 99%