RealMix: Towards Realistic Semi-Supervised Deep Learning Algorithms

Nair, Varun Sasidharan; Alonso, Javier; Beltramelli, Tony

doi:10.48550/arxiv.1912.08766

Cited by 9 publications

(18 citation statements)

References 5 publications

“…We also conduct experiments on Tiny ImageNet to verify the performance of our method on a larger dataset. Our method is compared against MixMatch (Berthelot et al, 2019b), RealMix (Nair et al, 2019), ReMixMatch (Berthelot et al, 2019a), and FixMatch (Sohn et al, 2020). As recommended by Oliver et al (2018), all methods should be implemented using the same codebase.…”

Section: Methods (mentioning)

confidence: 99%

“…As presented in Section 4, we evaluate our method against four methods: MixMatch (Berthelot et al, 2019b), RealMix (Nair et al, 2019), ReMixMatch (Berthelot et al, 2019a), and FixMatch (Sohn et al, 2020). The comparison of the methods is shown in Table 6.…”

Section: B Comparison Of Methods (mentioning)

confidence: 99%

“…In many real-world problems, it is often very difficult to create a large amount of labeled training data. Therefore, numerous studies have focused on how to leverage unlabeled data, leading to a variety of research fields like self-supervised learning (Doersch et al, 2015; Noroozi & Favaro, 2016; Gidaris et al, 2018), semi-supervised learning (Berthelot et al, 2019b; Nair et al, 2019; Berthelot et al, 2019a; Sohn et al, 2020), or metric learning (Hermans et al, 2017; Zhang et al, 2019). In self-supervised learning, pretext tasks are designed so that the model can learn meaningful information from a large number of unlabeled images.…”

Section: Introduction (mentioning)

confidence: 99%

RankingMatch: Delving into Semi-Supervised Learning with Consistency Regularization and Ranking Loss

Tran, Kang, Kim. 2021. Preprint.

Semi-supervised learning (SSL) has played an important role in leveraging unlabeled data when labeled data is limited. One of the most successful SSL approaches is based on consistency regularization, which encourages the model to produce unchanged outputs under perturbed inputs. However, less attention has been paid to inputs that have the same label. Motivated by the observation that inputs having the same label should have similar model outputs, we propose a novel method, RankingMatch, that considers not only the perturbed inputs but also the similarity among the inputs having the same label. In particular, we introduce a new objective function, dubbed BatchMean Triplet loss, which has the advantage of computational efficiency while taking into account all input samples. Our RankingMatch achieves state-of-the-art performance across many standard SSL benchmarks with a variety of labeled data amounts, including 95.13% accuracy on CIFAR-10 with 250 labels, 77.65% accuracy on CIFAR-100 with 10000 labels, 97.76% accuracy on SVHN with 250 labels, and 97.77% accuracy on SVHN with 1000 labels. We also perform an ablation study to prove the efficacy of the proposed BatchMean Triplet loss against existing versions of Triplet loss.

“…For semi-supervised learning, it is all about learning the underlying structure of a large amount of unlabeled data [30]. So far, we only rely on the phylogenetic tree to build a common label space shared by in-distribution and out-of-distribution data.…”

Section: Naive Pseudo-labeling Via Relation Prediction (mentioning)

confidence: 99%

Clue Me In: Semi-Supervised FGVC with Out-of-Distribution Data

Du, Chang, Ma, et al. 2021. Preprint.

Despite great strides made on fine-grained visual classification (FGVC), current methods are still heavily reliant on fully-supervised paradigms where ample expert labels are called for. Semi-supervised learning (SSL) techniques, acquiring knowledge from unlabeled data, provide a considerable means forward and have shown great promise for coarse-grained problems. However, existing SSL paradigms mostly assume in-distribution (i.e., category-aligned) unlabeled data, which hinders their effectiveness when repurposed for FGVC. In this paper, we put forward a novel design specifically aimed at making out-of-distribution data work for semi-supervised FGVC, i.e., to "clue them in". We work off an important assumption that all fine-grained categories naturally follow a hierarchical structure (e.g., the phylogenetic tree of "Aves" that covers all bird species). It follows that, instead of operating on individual samples, we can predict sample relations within this tree structure as the optimization goal of SSL. Beyond this, we further introduce two strategies uniquely brought by these tree structures to achieve inter-sample consistency regularization and reliable pseudo-relation. Our experimental results reveal that (i) the proposed method yields good robustness against out-of-distribution data, and (ii) it can be equipped with prior art, boosting its performance and thus yielding state-of-the-art results. Code is available at https://github.com/PRIS-CV/RelMatch.

“…Additional unlabelled observations are generated by perturbing the original set of unlabelled observations by adding random noise or transformations such as rotations or translations. Data augmentation is typically coupled with consistency regularization so that similar predictions are encouraged on the original instances and the augmented versions (Berthelot et al, 2019; Nair et al, 2019; Wei et al, 2021). The combined use of small local alterations and more aggressive global changes has been found to be an effective strategy (Sohn et al, 2020).…”

Section: Brief Overview Of Ssl Approaches (mentioning)

confidence: 99%

Semi-Supervised Learning of Classifiers from a Statistical Perspective: A Brief Review

Ahfock, McLachlan. 2021. Preprint.

There has been increasing attention to semi-supervised learning (SSL) approaches in machine learning for forming a classifier in situations where the training data consist of a limited number of classified observations but a much larger number of unclassified observations. This is because the procurement of classified data can be quite costly due to high acquisition costs and subsequent financial, time, and ethical issues that can arise in attempts to provide the true class labels for the unclassified data that have been acquired. We provide here a review of statistical SSL approaches to this problem, focussing on the recent result that a classifier formed from a partially classified sample can actually have a smaller expected error rate than if the sample were completely classified. This rather paradoxical outcome can be achieved by introducing a framework with a missingness mechanism for the missing labels of the unclassified observations. It is most relevant in commonly occurring situations in practice, where the unclassified data occur primarily in regions of relatively high entropy in the feature space, thereby making it difficult for their class labels to be easily obtained.
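
The citation statement from this review describes coupling data augmentation with consistency regularization, so that predictions on an unlabelled instance and on its perturbed copy are pushed to agree. A minimal sketch of that idea follows. It is not taken from RealMix or from any of the citing papers: the linear classifier, the Gaussian-noise perturbation, the squared-difference penalty, and all names (`predict`, `augment`, `consistency_loss`) are assumptions chosen for illustration only.

```python
import math
import random


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    # Hypothetical linear classifier: one weight row per class.
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return softmax(logits)


def augment(x, rng, noise_std=0.1):
    # Assumed perturbation: additive Gaussian noise on each feature.
    return [xi + rng.gauss(0.0, noise_std) for xi in x]


def consistency_loss(weights, unlabeled, rng, noise_std=0.1):
    # Mean squared difference between class probabilities predicted for
    # each unlabelled point and for its perturbed copy.
    total = 0.0
    for x in unlabeled:
        p_orig = predict(weights, x)
        p_aug = predict(weights, augment(x, rng, noise_std))
        total += sum((a - b) ** 2 for a, b in zip(p_orig, p_aug)) / len(p_orig)
    return total / len(unlabeled)


rng = random.Random(0)
weights = [[1.0, -0.5], [-0.3, 0.8], [0.2, 0.1]]   # 3 classes, 2 features
unlabeled = [[0.5, 1.0], [-1.2, 0.3], [0.0, 0.0]]  # toy unlabelled set
loss = consistency_loss(weights, unlabeled, rng)
print(f"consistency loss: {loss:.6f}")
```

In a full semi-supervised objective, a term of this kind would typically be added, with some weight, to the supervised loss computed on the labelled observations; the weighting and the choice of perturbation are design decisions that differ between the methods cited above.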
