2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr42600.2020.01070

Self-Training With Noisy Student Improves ImageNet Classification

Cited by 1,882 publications (1,488 citation statements)
References 30 publications
“…Recent studies adopt a teacher-student training paradigm, i.e., pseudolabels are generated by the teacher model on the unlabeled dataset, which is then combined with the labeled data and used to train or finetune the student model. For example, an iterative training scheme is proposed in [35], where the trained student model is used as the teacher model at the subsequent training round. The method outperforms the fully supervised counterpart on ImageNet by a large margin.…”
Section: B Learning (mentioning; confidence: 99%)
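The iterative teacher-student scheme quoted above can be summarized in a short loop: train a teacher on the labeled data, pseudo-label the unlabeled pool, train a student on the union, then promote the student to teacher for the next round. The sketch below is a minimal illustration of that loop using scikit-learn classifiers as stand-ins for the deep networks in [35]; the student noising (data augmentation, dropout, stochastic depth) that is central to Noisy Student is omitted, and all variable names and the number of rounds are illustrative assumptions.

# Minimal sketch of iterative teacher-student pseudo-labeling (assumptions noted above).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Toy data: a small labeled set and a larger unlabeled pool.
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_lab, X_unlab, y_lab, _ = train_test_split(X, y, train_size=0.1, random_state=0)

N_ROUNDS = 3  # number of teacher -> student iterations (illustrative)

# Round 0: the teacher is trained on labeled data only.
teacher = RandomForestClassifier(random_state=0).fit(X_lab, y_lab)

for round_idx in range(N_ROUNDS):
    # 1. The teacher generates pseudo-labels on the unlabeled pool.
    pseudo_labels = teacher.predict(X_unlab)

    # 2. Labeled and pseudo-labeled data are combined to train the student.
    X_combined = np.concatenate([X_lab, X_unlab])
    y_combined = np.concatenate([y_lab, pseudo_labels])
    student = RandomForestClassifier(random_state=round_idx).fit(X_combined, y_combined)

    # 3. The trained student becomes the teacher for the next round.
    teacher = student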
“…Offline training. Image classifiers used by companies such as Facebook or Google are trained on tens of millions of labeled images (Kuznetsova et al., 2020), or pretrained on billions of images (Mahajan et al., 2018; Xie et al., 2020), to reach the level of quality required by certain products. Natural language processing (NLP) systems for machine translation, or speech recognition systems such as BERT (Devlin et al., 2019), also require billions of samples to generalize and achieve decent performance in real applications.…”
(mentioning; confidence: 99%)
“…Most SSL methods take a pseudo-label learning approach [6,19,16,30]. For example, pseudo labeling [19] (also called self-training [36]) first trains on labeled data; unlabeled samples whose prediction confidence exceeds a predefined threshold are then added to the training data, and the classifier is retrained iteratively. Consistency regularization [21] generates pseudo-labeled samples based on the idea that a classifier should output the same class distribution for an unlabeled example even after it has been augmented.…”
Section: Related Work (mentioning; confidence: 99%)
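The confidence-thresholded pseudo-labeling loop described in this passage can be sketched as follows. This is a minimal illustration, assuming a scikit-learn logistic-regression classifier and an arbitrary 0.95 threshold (both assumptions, not taken from [19] or [36]); consistency regularization [21] is not shown.

# Minimal sketch of confidence-thresholded pseudo labeling (assumptions noted above).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
X_lab, X_unlab, y_lab, _ = train_test_split(X, y, train_size=0.05, random_state=0)

THRESHOLD = 0.95  # predefined confidence threshold (illustrative)
N_ITERS = 5       # maximum number of self-training iterations (illustrative)

clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)

for _ in range(N_ITERS):
    if len(X_unlab) == 0:
        break
    probs = clf.predict_proba(X_unlab)  # per-sample class distribution
    preds = clf.predict(X_unlab)        # pseudo-labels (predicted classes)
    confident = probs.max(axis=1) >= THRESHOLD  # samples above the threshold

    if not confident.any():
        break

    # Move confident samples, with their pseudo-labels, into the labeled set.
    X_lab = np.concatenate([X_lab, X_unlab[confident]])
    y_lab = np.concatenate([y_lab, preds[confident]])
    X_unlab = X_unlab[~confident]

    # Retrain the classifier on the enlarged labeled set.
    clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)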