Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/d19-1670

EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks

Abstract: We present EDA: easy data augmentation techniques for boosting performance on text classification tasks. EDA consists of four simple but powerful operations: synonym replacement, random insertion, random swap, and random deletion. On five text classification tasks, we show that EDA improves performance for both convolutional and recurrent neural networks. EDA demonstrates particularly strong results for smaller datasets; on average, across five datasets, training with EDA while using only 50% of the available …
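The four EDA operations named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the `SYNONYMS` table is a toy stand-in (the paper draws synonyms from WordNet), and the parameter names `n` and `p` are illustrative.

```python
import random

# Toy synonym table for illustration only; the paper uses WordNet.
SYNONYMS = {"quick": ["fast", "speedy"], "happy": ["glad", "joyful"]}

def synonym_replacement(tokens, n=1):
    """Replace up to n words that have known synonyms."""
    out = tokens[:]
    candidates = [i for i, t in enumerate(out) if t in SYNONYMS]
    for i in random.sample(candidates, min(n, len(candidates))):
        out[i] = random.choice(SYNONYMS[out[i]])
    return out

def random_insertion(tokens, n=1):
    """Insert a synonym of a random word at a random position, n times."""
    out = tokens[:]
    for _ in range(n):
        candidates = [t for t in out if t in SYNONYMS]
        if not candidates:
            return out
        syn = random.choice(SYNONYMS[random.choice(candidates)])
        out.insert(random.randrange(len(out) + 1), syn)
    return out

def random_swap(tokens, n=1):
    """Swap two randomly chosen positions, n times."""
    out = tokens[:]
    for _ in range(n):
        i, j = random.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out

def random_deletion(tokens, p=0.1):
    """Drop each word with probability p, keeping at least one word."""
    kept = [t for t in tokens if random.random() > p]
    return kept if kept else [random.choice(tokens)]
```

Each operation returns a new token list, so a single sentence can be expanded into several augmented variants by applying the operations independently.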

Cited by 1,092 publications (858 citation statements)
References 24 publications
“…Although many augmentation methods exist for images, AutoAugment [29] was proposed to automatically search for augmentation policies based on the dataset. Beyond images, augmentation methods such as synonym replacement, random insertion, random swap, and random deletion are used for text classification [30], where training on only half of the data achieves the same accuracy as training on the full dataset. For speech recognition tasks, training audio is augmented by changing the audio speed [31] and by warping features, masking blocks of frequency channels, and masking blocks of time steps [32].…”
Section: B. Data Augmentation in Deep Learning
confidence: 99%
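The frequency-channel masking mentioned for speech augmentation [32] can be sketched as follows. This is an illustrative sketch, not the reference implementation: `freq_mask` and its parameters are assumed names, and the spectrogram is represented as a plain nested list of frames.

```python
import random

def freq_mask(spec, max_width=2, seed=None):
    """Zero out a random contiguous band of frequency channels.

    spec: list of frames, each a list of channel values.
    max_width: maximum number of channels to mask (illustrative parameter).
    """
    rng = random.Random(seed)
    n_channels = len(spec[0])
    width = rng.randint(0, max_width)
    start = rng.randint(0, n_channels - width)
    # The same band is zeroed in every frame, as in SpecAugment-style masking.
    return [
        [0.0 if start <= c < start + width else v for c, v in enumerate(row)]
        for row in spec
    ]
```

Time-step masking is the symmetric operation: zeroing a random contiguous block of frames instead of channels.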
“…Left-in datasets were merged and randomly split again into training and validation batches at a 9:1 ratio, resulting in ~3,300 and ~300 sentence pairs, respectively. The source sentences were augmented with random swap and random deletion operations, as described in [26], to further improve model generalization. As baseline models for predicting navigation steps at class level (Fig.…”
Section: Model Training and Baseline
confidence: 99%
“…Another solution is to artificially augment the existing dataset. This is common practice when working with image data [24]. Data augmentation involves operations such as scaling, rotation, translation, flipping, resizing, adding noise, and perspective transforms.…”
Section: Data Augmentation
confidence: 99%
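Two of the listed image operations (horizontal flipping and additive noise) can be sketched on a grayscale image stored as a nested list. Function names and the noise `scale` parameter are illustrative; real pipelines would typically use an array or tensor library instead.

```python
import random

def hflip(img):
    """Flip a grayscale image (list of rows) left-to-right."""
    return [row[::-1] for row in img]

def add_noise(img, scale=5, seed=None):
    """Add uniform noise in [-scale, scale] to every pixel."""
    rng = random.Random(seed)
    return [[px + rng.uniform(-scale, scale) for px in row] for row in img]
```

Geometric operations like rotation and perspective transforms follow the same pattern, mapping each output pixel back to a source coordinate.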