2019
DOI: 10.48550/arxiv.1912.08741
Preprint

Towards Robust Learning with Different Label Noise Distributions

Abstract: Noisy labels are an unavoidable consequence of automatic image labeling processes to reduce human supervision. Training in these conditions leads Convolutional Neural Networks to memorize label noise and degrade performance. Noisy labels are therefore dispensable, while image content can be exploited in a semi-supervised learning (SSL) setup. Handling label noise then becomes a label noise detection task. Noisy/clean samples are usually identified using the small loss trick, which is based on the observation t…
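As a rough illustration of the loss-based noise detection the abstract describes, the sketch below splits a training set into presumed-clean and presumed-noisy subsets by ranking per-sample losses. This is a minimal sketch, not the paper's exact procedure: the keep-fraction `clean_ratio` and the index-yielding data loader are illustrative assumptions.

```python
# Minimal sketch of a small-loss-based clean/noisy split.
# Assumptions: the loader yields (image, label, sample_index) triples,
# and `clean_ratio` is a hand-picked fraction (not from the paper).
import torch
import torch.nn.functional as F

@torch.no_grad()
def small_loss_split(model, loader, device, clean_ratio=0.5):
    model.eval()
    losses, indices = [], []
    for x, y, idx in loader:
        logits = model(x.to(device))
        loss = F.cross_entropy(logits, y.to(device), reduction="none")
        losses.append(loss.cpu())
        indices.append(idx)
    losses = torch.cat(losses)
    indices = torch.cat(indices)
    order = torch.argsort(losses)          # ascending: small loss first
    n_clean = int(clean_ratio * len(order))
    clean_idx = indices[order[:n_clean]]   # treated as correctly labelled
    noisy_idx = indices[order[n_clean:]]   # labels discarded; images kept
    return clean_idx, noisy_idx
```

The noisy subset keeps its images but drops its labels, which is what makes a semi-supervised setup applicable downstream.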

Cited by 4 publications (15 citation statements) | References 35 publications
“…For short-time learning, the two could be similar. Among the large body of work on minimizing large current losses (e.g., Fan et al (2017); Zhang et al (2019a); Jiang et al (2019); Ortego et al (2019) and references therein), Kawaguchi & Lu (2020) introduced ordered stochastic gradient descent (SGD), which is purposely biased toward instances with higher current losses. They empirically show that ordered SGD outperforms standard SGD (Bottou et al, 2018) once the learning rate is decreased, even though it is not better initially.…”
Section: Discussion
confidence: 99%
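As a rough sketch of the ordered SGD idea quoted above, the following minibatch step backpropagates only through the q samples with the largest current losses. `q` and the optimizer handling are illustrative choices, not settings from Kawaguchi & Lu (2020).

```python
# Hedged sketch of an ordered-SGD-style update: keep only the q largest
# per-sample losses in the minibatch, so the update is purposely biased
# toward the hardest instances.
import torch
import torch.nn.functional as F

def ordered_sgd_step(model, optimizer, x, y, q=16):
    logits = model(x)
    per_sample = F.cross_entropy(logits, y, reduction="none")
    topq, _ = torch.topk(per_sample, k=min(q, per_sample.numel()))
    loss = topq.mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```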
“…SSL methods [30,42] first classify samples as clean or noisy, where the noisy samples are re-labelled by the model, and these clean and noisy sets are combined with MixMatch [4]. As mentioned before, SSL methods have two issues: 1) the training set size for the MixMatch stage is limited by the clean set size, which shrinks as label noise increases, and 2) the accuracy of re-labelling noisy samples also drops as label noise increases.…”
Section: Prior Work
confidence: 99%
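The sketch below illustrates, under stated assumptions, the combination step this statement describes: noisy samples are re-labelled with sharpened model predictions, then mixed with the labelled clean set via MixUp as in MixMatch-style training. The temperature `T` and Beta parameter `alpha` are assumed hyperparameters, not values from the cited papers.

```python
# Illustrative MixMatch-style combination of a clean set (trusted labels)
# and a noisy set (pseudo-labels from the model). Labels are assumed to
# be one-hot / soft probability vectors of shape (N, num_classes).
import torch
import torch.nn.functional as F

@torch.no_grad()
def relabel(model, x_noisy, T=0.5):
    probs = F.softmax(model(x_noisy), dim=1)
    probs = probs ** (1.0 / T)                    # temperature sharpening
    return probs / probs.sum(dim=1, keepdim=True)

def mixmatch_batch(x_clean, y_clean_onehot, x_noisy, y_pseudo, alpha=0.75):
    x = torch.cat([x_clean, x_noisy])
    y = torch.cat([y_clean_onehot, y_pseudo])
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    lam = max(lam, 1 - lam)                       # keep the original sample dominant
    perm = torch.randperm(x.size(0))
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y + (1 - lam) * y[perm]
    return x_mix, y_mix                           # train against the soft y_mix
```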
“…Moreover, in high noise rate scenarios, the filtered clean set can be too small to train the model. The noisy set can be used after re-labelling the noisy samples [30,42,50], which works well for low noise rate problems, but as the noise rate increases this approach becomes less effective because incorrectly re-labelled samples bias the training. The use of clean validation sets in a meta-learning approach [2] can reduce this issue, but the existence of a clean validation set may be infeasible in real-world applications.…”
Section: Prior Work
confidence: 99%
“…Automatic annotation of data becomes a plausible answer [16], though it unavoidably introduces some incorrect or noisy labels. To prevent harming the learned representations [17], label noise-resistant training of CNNs is often necessary [18], [19], [20], [21]. In particular, the small loss trick [17] associates examples with a low (high) training loss with clean (noisy) labels.…”
Section: Introduction
confidence: 99%
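A complementary way to operationalize the small loss trick mentioned in this statement is to fit a two-component Gaussian mixture to the per-sample losses and read the posterior of the low-mean component as a per-sample clean-label probability. This is a DivideMix-style heuristic sketched under assumptions; the 0.5 threshold and `reg_covar` value are illustrative choices.

```python
# Minimal sketch: model per-sample losses with a 2-component GMM and use
# the low-loss component's posterior as the probability a label is clean.
import numpy as np
from sklearn.mixture import GaussianMixture

def clean_probability(losses: np.ndarray) -> np.ndarray:
    l = losses.reshape(-1, 1).astype(np.float64)
    l = (l - l.min()) / (l.max() - l.min() + 1e-8)   # normalize to [0, 1]
    gmm = GaussianMixture(n_components=2, reg_covar=5e-4).fit(l)
    clean_comp = gmm.means_.argmin()                 # low-loss component = clean
    return gmm.predict_proba(l)[:, clean_comp]

# Usage: clean_mask = clean_probability(losses) > 0.5
```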