2020
DOI: 10.48550/arxiv.2010.03622
Preprint
Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data

Abstract: Self-training algorithms, which train a model to fit pseudolabels predicted by another previously-learned model, have been very successful for learning with unlabeled data using neural networks. However, the current theoretical understanding of self-training only applies to linear models. This work provides a unified theoretical analysis of self-training with deep networks for semi-supervised learning, unsupervised domain adaptation, and unsupervised learning. At the core of our analysis is a simple but realist…

Cited by 31 publications (43 citation statements)
References 57 publications
“…An alternative way to define φ(·) for a sample x_l is the discrepancy between the model's prediction on the sample and on its adversarial neighbor (Wei et al 2020). An adversarial neighbor is a sample that is similar to x_l in terms of the input graph g_l but has the most different prediction.…”
Section: Uncertainty Measurements (mentioning)
confidence: 99%
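The excerpt above describes an uncertainty measure based on input consistency with an adversarial neighbor. Below is a minimal sketch of that idea, assuming a PyTorch classifier and a VAT-style perturbation (Miyato et al. 2018) in place of a graph-based neighbor; the function names, the single power-iteration step, and the choice of KL divergence are illustrative assumptions, not the cited papers' exact formulation.

import torch
import torch.nn.functional as F

def _l2_normalize(d):
    # Normalize each sample's perturbation to unit L2 norm.
    norm = d.flatten(1).norm(dim=1).view(-1, *([1] * (d.dim() - 1)))
    return d / (norm + 1e-12)

def adversarial_neighbor_discrepancy(model, x, xi=1e-6, eps=1.0):
    # Hypothetical uncertainty score phi(x): divergence between the model's
    # prediction on x and on a VAT-style adversarial neighbor x + r_adv.
    with torch.no_grad():
        p = F.softmax(model(x), dim=1)            # prediction on the original sample

    # One power-iteration step to approximate the direction that
    # changes the prediction the most.
    d = _l2_normalize(torch.randn_like(x)).mul_(xi).detach().requires_grad_(True)
    adv_div = F.kl_div(F.log_softmax(model(x + d), dim=1), p, reduction="batchmean")
    grad = torch.autograd.grad(adv_div, d)[0]

    with torch.no_grad():
        r_adv = eps * _l2_normalize(grad)         # offset to the adversarial neighbor
        p_adv = F.log_softmax(model(x + r_adv), dim=1)
        # Per-sample discrepancy: a larger value means a less consistent,
        # hence more uncertain, prediction.
        return F.kl_div(p_adv, p, reduction="none").sum(dim=1)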
“…gorithm and show its convergence under proper initialization. Recent theoretical analysis (Wei et al 2020) and empirical evidence show that an input-consistency loss such as the VAT loss (Miyato et al 2018) can further improve pseudolabeling in semi-supervised learning. Han et al (2019) point out that pseudolabel imputation can be viewed as minimizing the min-entropy, a type of Rényi entropy $\frac{1}{1-\alpha}\log\bigl(\sum_{i=1}^{n} p_i^{\alpha}\bigr)$ with α → ∞, while the Shannon entropy in (Grandvalet and Bengio 2005) is the case α → 1.…”
Section: Related Work (mentioning)
confidence: 99%
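For reference, the entropy relationship quoted above is a standard identity rather than anything specific to the cited works: the Rényi entropy is $H_\alpha(p) = \frac{1}{1-\alpha}\log\bigl(\sum_{i=1}^{n} p_i^{\alpha}\bigr)$, which recovers the Shannon entropy $-\sum_{i=1}^{n} p_i \log p_i$ in the limit $\alpha \to 1$ and the min-entropy $-\log \max_i p_i$ as $\alpha \to \infty$.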
“…However, it has been shown that such a procedure may suffer from local minima (Grandvalet and Bengio 2005) or over-confident wrong pseudolabels (Zou et al 2019). Wei et al (2020) show that when the underlying data distribution and the pseudolabeler satisfy the expansion assumption (see Definition 3.1 and Assumptions 4.1 and 3.3 in Wei et al (2020)), self-training algorithms with input consistency are able to improve on the pseudolabeler (Theorem 4.3 in Wei et al (2020)). Intuitively, the condition states that there need to be many correctly-labeled neighbors around the errors made by the pseudolabeler, so that the correct labels can refine the decision boundary.…”
Section: Algorithm (mentioning)
confidence: 99%
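A minimal sketch of the kind of objective this excerpt describes: fit confident pseudolabels from a fixed pseudolabeler while penalizing input inconsistency between perturbed copies of each unlabeled example. It assumes PyTorch; the teacher/student naming, the augment function, the confidence threshold tau, and the weight lambda_c are illustrative assumptions, not the exact algorithm analyzed in Wei et al (2020).

import torch
import torch.nn.functional as F

def self_training_step(student, teacher, x_unlabeled, augment, optimizer,
                       lambda_c=1.0, tau=0.9):
    # One update on unlabeled data: masked cross-entropy on pseudolabels from a
    # fixed teacher, plus an input-consistency penalty between two augmented views.
    with torch.no_grad():
        probs = F.softmax(teacher(x_unlabeled), dim=1)   # pseudolabeler predictions
        conf, pseudo = probs.max(dim=1)
        mask = (conf >= tau).float()                     # keep only confident pseudolabels

    view_a, view_b = augment(x_unlabeled), augment(x_unlabeled)
    logits_a, logits_b = student(view_a), student(view_b)

    # Pseudolabel-fitting term (masked cross-entropy on one view).
    ce = F.cross_entropy(logits_a, pseudo, reduction="none")
    pseudo_loss = (mask * ce).mean()

    # Input-consistency term: the two views of the same input should agree.
    consistency = F.kl_div(F.log_softmax(logits_b, dim=1),
                           F.softmax(logits_a, dim=1).detach(),
                           reduction="batchmean")

    loss = pseudo_loss + lambda_c * consistency
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()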