2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2017.696

Learning from Noisy Large-Scale Datasets with Minimal Supervision

Abstract: We present an approach to effectively use millions of images with noisy annotations in conjunction with a small subset of cleanly-annotated images to learn powerful image representations. One common approach to combine clean and noisy data is to first pre-train a network using the large noisy dataset and then fine-tune with the clean dataset. We show this approach does not fully leverage the information contained in the clean set. Thus, we demonstrate how to use the clean annotations to reduce the noise in the…
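As context for the abstract, here is a minimal PyTorch-style sketch of the two-stage baseline it contrasts against: pre-train on the large noisy set, then fine-tune on the small clean subset. Everything here (epoch counts, learning rates, the loaders) is an illustrative assumption, not the paper's actual training setup.

```python
# Sketch of the pre-train/fine-tune baseline described in the abstract.
# Hyperparameters and loaders are illustrative placeholders.
from torch import nn, optim

def train_epoch(model, loader, criterion, optimizer, device="cpu"):
    model.train()
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

def pretrain_then_finetune(model, noisy_loader, clean_loader, device="cpu"):
    criterion = nn.CrossEntropyLoss()
    # Stage 1: pre-train on the large, noisily annotated dataset.
    opt = optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)
    for _ in range(10):
        train_epoch(model, noisy_loader, criterion, opt, device)
    # Stage 2: fine-tune on the small, cleanly annotated subset
    # (lower learning rate so fine-tuning refines rather than overwrites).
    opt = optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    for _ in range(5):
        train_epoch(model, clean_loader, criterion, opt, device)
    return model
```

The abstract argues that this baseline does not fully exploit the clean annotations, which motivates the paper's alternative of using the clean set to reduce the noise in the large set.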

Cited by 420 publications (270 citation statements)
References 22 publications (22 reference statements)
“…not rely on clean labels to remove the noise. More accurate solutions, which rely on clean labels during the training phase, have thus been explored (e.g., [16], [22], [41]). These solutions generally train a separate network for distinguishing noisy labels from clean ones.…”
Section: Related Work
confidence: 99%
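The statement above describes solutions that train a separate network to tell noisy labels from clean ones. Below is a hypothetical minimal sketch of such a label-verification classifier; the architecture and the feature/label interface are assumptions for illustration, not the design of any cited work.

```python
# Hypothetical label-verification network: given an image embedding and its
# (possibly noisy) label, predict whether the label is clean.
import torch
from torch import nn

class LabelVerifier(nn.Module):
    def __init__(self, feat_dim: int, num_classes: int, hidden: int = 256):
        super().__init__()
        self.num_classes = num_classes
        self.net = nn.Sequential(
            nn.Linear(feat_dim + num_classes, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # logit for P(label is clean)
        )

    def forward(self, features: torch.Tensor, noisy_labels: torch.Tensor):
        onehot = nn.functional.one_hot(noisy_labels, self.num_classes).float()
        return self.net(torch.cat([features, onehot], dim=1)).squeeze(1)

# Typically trained with BCEWithLogitsLoss on a small set where both the noisy
# and the verified label are known, then applied to the large noisy set to
# down-weight or filter samples whose labels look wrong.
```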
“…variants, that is, applying mixup to examples of the same batch after random permutation, or to examples of two different batches. No major differences are observed.…”
Section: CCE (Same As Baseline In…
confidence: 99%
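The excerpt above compares applying mixup within one batch (after a random permutation) against mixing two different batches. Below is a minimal sketch of the in-batch variant; the Beta parameter and the loss pairing follow the common mixup recipe and are illustrative, not taken from the cited paper.

```python
# In-batch mixup via random permutation: each example is blended with another
# example drawn from the same batch.
import torch

def mixup_batch(x: torch.Tensor, y: torch.Tensor, alpha: float = 0.2):
    """Return mixed inputs plus the two label sets and the mixing coefficient."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    x_mix = lam * x + (1.0 - lam) * x[perm]
    return x_mix, y, y[perm], lam

# A training step then combines the two targets:
#   loss = lam * ce(logits, y_a) + (1 - lam) * ce(logits, y_b)
```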
“…However, such estimation is not trivial, and it assumes that the only possible type of noise is flipping labels. Other approaches use noise-robust loss functions to mitigate the effect of label noise [14], or leverage an additional set of curated data, for example to train a label cleaning network in order to reduce the noise of a dataset [15]. Conversely, learning with noisy labels has received little attention in sound recognition, probably given the traditional paradigm of learning from relatively small and exhaustively labeled (hence clean) datasets.…”
Section: Introduction
confidence: 99%
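The excerpt above mentions noise-robust loss functions as one way to mitigate label noise. As an illustrative example, and not necessarily the loss used in the cited work [14], here is a sketch of the generalized cross-entropy (Lq) loss, which interpolates between cross-entropy (q → 0) and mean absolute error (q = 1).

```python
# Generalized cross-entropy (Lq) loss: L_q = (1 - p_y^q) / q, a common
# noise-robust alternative to standard cross-entropy.
import torch
from torch import nn

class GeneralizedCrossEntropy(nn.Module):
    def __init__(self, q: float = 0.7):
        super().__init__()
        self.q = q

    def forward(self, logits: torch.Tensor, targets: torch.Tensor):
        probs = torch.softmax(logits, dim=1)
        # Probability assigned to the (possibly noisy) target class.
        p_y = probs.gather(1, targets.unsqueeze(1)).squeeze(1).clamp_min(1e-7)
        return ((1.0 - p_y.pow(self.q)) / self.q).mean()
```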
“…Most of the original methods for label adjustment adopt EM-like algorithms to infer the true labels of training samples. Alternatively, noisy labels can also be adjusted by the predictions of deep models. The methods based on sample selection have obtained state-of-the-art performance in addressing noisy labels.…”
Section: Introduction
confidence: 99%
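The excerpt above notes that noisy labels can be adjusted using the predictions of deep models. A hedged sketch of one such prediction-based adjustment, a soft bootstrapping-style blended target, is given below; the blending weight beta is an illustrative choice, not a value from any cited work.

```python
# Label adjustment via the model's own predictions: the training target is a
# blend of the given (noisy) one-hot label and the current prediction.
import torch
from torch import nn

def bootstrap_soft_loss(logits: torch.Tensor, noisy_labels: torch.Tensor,
                        beta: float = 0.8) -> torch.Tensor:
    log_probs = torch.log_softmax(logits, dim=1)
    probs = log_probs.exp()
    onehot = nn.functional.one_hot(noisy_labels, logits.size(1)).float()
    # Blended target: mostly the provided label, partly the model's belief.
    targets = beta * onehot + (1.0 - beta) * probs.detach()
    return -(targets * log_probs).sum(dim=1).mean()
```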
“…[39][40][41] Alternatively, noisy labels can also be adjusted by the predictions of deep models. [42][43][44] The methods based on sample selection have obtained state-of-the-art performance in addressing noisy labels. 45,46 Specifically, Jiang et al proposed learning a data-driven curriculum that provides a sample weighting scheme for deep models trained on noisy labels.…”
Section: Introduction
confidence: 99%
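The excerpt above also points to sample-selection methods. A common instance is small-loss selection, sketched below under the assumption that low-loss examples are more likely to carry clean labels; the keep ratio is an illustrative placeholder rather than a published schedule.

```python
# Small-loss sample selection: in each batch, only the examples with the
# smallest loss (presumed clean) contribute to the parameter update.
import torch
from torch import nn

def small_loss_step(model, images, labels, optimizer, keep_ratio: float = 0.7):
    criterion = nn.CrossEntropyLoss(reduction="none")
    optimizer.zero_grad()
    losses = criterion(model(images), labels)             # per-example losses
    k = max(1, int(keep_ratio * images.size(0)))
    keep = torch.topk(losses, k, largest=False).indices   # k smallest losses
    loss = losses[keep].mean()                            # presumed-clean subset
    loss.backward()
    optimizer.step()
    return loss.item()
```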