2019
DOI: 10.48550/arxiv.1904.12848
Preprint

Unsupervised Data Augmentation for Consistency Training

Qizhe Xie, Zihang Dai, Eduard Hovy, et al.

Abstract: Semi-supervised learning lately has shown much promise in improving deep learning models when labeled data is scarce. Common among recent approaches is the use of consistency training on a large amount of unlabeled data to constrain model predictions to be invariant to input noise. In this work, we present a new perspective on how to effectively noise unlabeled examples and argue that the quality of noising, specifically those produced by advanced data augmentation methods, plays a crucial role in semi-supervised learning…

Cited by 287 publications (572 citation statements)
References 33 publications
“…Virtual Adversarial Training (VAT) (Miyato et al, 2018) is an effective regularization technique that applies the slight input perturbations to which the model's predictions on unlabeled samples are most sensitive. More recent techniques like FixMatch (Sohn et al, 2020), MixMatch (Berthelot et al, 2019) and UDA (Xie et al, 2019) use data augmentations like flips, rotations, and crops to predict pseudo-labels. In this paper, we propose a new SSL technique that uses class-wise instantiations of SMI functions, which mitigates class imbalance in the selected subsets and is comparatively robust to OOD classes in the unlabeled set.…”
Section: Related Work
confidence: 99%
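The augmentation-based pseudo-labeling shared by UDA and FixMatch can be sketched compactly. Below is a minimal PyTorch-style illustration of a consistency loss on an unlabeled batch, assuming placeholder `weak_augment` and `strong_augment` callables (e.g. flip/crop versus RandAugment or back-translation); it captures the general recipe rather than any cited paper's exact loss.

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, x_unlabeled, weak_augment, strong_augment,
                     threshold=0.8, temperature=0.4):
    """Sketch of a UDA/FixMatch-style consistency loss (assumed interface).

    Pseudo-labels come from a weakly augmented view; predictions on a
    strongly augmented view are pushed toward them.
    """
    with torch.no_grad():
        # Sharpened pseudo-label distribution from the weak view (no gradient).
        logits_weak = model(weak_augment(x_unlabeled))
        targets = F.softmax(logits_weak / temperature, dim=-1)
        confidence, _ = targets.max(dim=-1)
        mask = (confidence >= threshold).float()  # confidence-based masking

    # KL divergence between the target and the strong-view prediction,
    # counted only for confidently pseudo-labeled examples.
    logits_strong = model(strong_augment(x_unlabeled))
    log_probs = F.log_softmax(logits_strong, dim=-1)
    per_example_kl = F.kl_div(log_probs, targets, reduction="none").sum(dim=-1)
    return (per_example_kl * mask).mean()
```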
“…[25] systematically examined some basic augmentation methods, including random synonym replacement, word insertion, etc. [27] utilized TF-IDF to help determine which words to replace. [23] adopted k-nearest neighbors to find synonyms in word embedding space.…”
Section: Data Augmentation for Natural Language Processing (NLP)
confidence: 99%
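The TF-IDF-guided replacement attributed to [27] can be illustrated with a short sketch: low-TF-IDF (uninformative) words are preferred as replacement targets, so the keywords that carry the label are left intact. The `synonyms` dictionary and the function signature are assumptions made for this example, not an API from the cited work.

```python
import math
import random
from collections import Counter

def tfidf_guided_replace(tokens, corpus, synonyms, replace_frac=0.2):
    """Sketch of TF-IDF-guided word replacement (hypothetical interface).

    tokens:   list of words in the sentence to augment
    corpus:   list of tokenized documents used to estimate document frequency
    synonyms: dict mapping a word to a list of candidate substitutes
    """
    n_docs = len(corpus)
    doc_freq = Counter(w for doc in corpus for w in set(doc))
    term_freq = Counter(tokens)
    tfidf = {w: term_freq[w] * math.log(n_docs / (1 + doc_freq[w]))
             for w in set(tokens)}

    # Prefer replacing the least informative (lowest TF-IDF) words.
    candidates = [w for w in sorted(set(tokens), key=tfidf.get) if w in synonyms]
    n_replace = max(1, int(replace_frac * len(tokens)))
    chosen = set(candidates[:n_replace])
    return [random.choice(synonyms[w]) if w in chosen else w for w in tokens]
```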
“…Neural MWP solvers can benefit from our augmentation strategies in terms of generalization and the ability to deal with tiny local variances. Unlike other popular augmentation approaches [25,27,30], which may cause inconsistency between the questions and equations in the MWP task, our augmentation methods are carefully designed for the MWP task to ensure consistency.…”
Section: Introduction
confidence: 99%
“…Temporal Ensembling [19] uses previous model checkpoints, while Mean Teacher [26] uses an exponential moving average of model parameters. UDA [30] and ReMixMatch [18] sharpen the soft label to make the model produce high-confidence predictions. UDA further enforces consistency only when the highest probability of the predicted category distribution for soft labels is above a threshold.…”
Section: Semi-Supervised Learning
confidence: 99%
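Two ingredients named in this excerpt, the Mean Teacher EMA update and the sharpening of soft labels, fit in a few lines of PyTorch; the decay and temperature values below are illustrative defaults rather than the cited papers' settings.

```python
import torch

@torch.no_grad()
def update_ema_teacher(student, teacher, decay=0.999):
    """Mean Teacher-style update (sketch): teacher weights track an
    exponential moving average of the student weights."""
    for p_teacher, p_student in zip(teacher.parameters(), student.parameters()):
        p_teacher.mul_(decay).add_(p_student, alpha=1.0 - decay)

def sharpen(probs, temperature=0.5):
    """UDA/ReMixMatch-style temperature sharpening: temperatures below 1
    push the soft label toward a one-hot distribution."""
    powered = probs ** (1.0 / temperature)
    return powered / powered.sum(dim=-1, keepdim=True)
```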
“…However, the predictions made by the teacher could be noisy, especially at the beginning of the training process, which hinders the model from fitting the supervised loss. A solution is to filter out low-quality predictions with a confidence-based masking strategy [30]. Specifically, we maintain a confident node set V_C ⊆ V_U during training whose elements are unlabeled nodes with highly skewed predictions, i.e.,…”
Section: Additional Training Techniques
confidence: 99%
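A minimal sketch of that confidence-based masking step: an unlabeled node enters the confident set V_C only if its predicted class distribution is sufficiently skewed, i.e. its top probability exceeds a threshold. Tensor names and the threshold value are assumptions for illustration.

```python
import torch

def select_confident_nodes(probs_unlabeled, unlabeled_idx, threshold=0.9):
    """Build the confident node set V_C ⊆ V_U (sketch, hypothetical names).

    probs_unlabeled: (num_unlabeled, num_classes) softmax predictions
    unlabeled_idx:   (num_unlabeled,) indices of the unlabeled nodes V_U
    """
    top_prob, pseudo_label = probs_unlabeled.max(dim=-1)
    keep = top_prob >= threshold                 # highly skewed predictions only
    return unlabeled_idx[keep], pseudo_label[keep]
```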