2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
DOI: 10.1109/wacv51458.2022.00245
Addressing out-of-distribution label noise in webly-labelled data

Abstract: A recurring focus of the deep learning community is reducing the labeling effort. Data gathering and annotation using a search engine is a simple alternative to generating a fully human-annotated and human-gathered dataset. Although web crawling is very time efficient, some of the retrieved images are unavoidably noisy, i.e. incorrectly labeled. Designing robust algorithms for training on noisy data gathered from the web is an important research perspective that would render the building of datasets ea…

Cited by 14 publications (25 citation statements) · References 34 publications

“…Up to now, very few studies address this issue, and no efficient and effective solution can handle it either. [Reported accuracies: …: 61.91; Co-teaching+ [47]: 81.61; JoCoR [34]: 77.94; Jo-SRC [50]: 86.66; Co-learning [49]: 87.57; DSOS [39]: 87.70.] We think it might be wise for future studies on label learning to focus more on these hard samples. We define "hard data" as data that are distributed close to the decision boundary.…”
Section: Results Listed in Table
confidence: 99%
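
The "hard data" definition in the statement above can be made concrete with a small, purely illustrative sketch (not from any of the cited papers): here, proximity to the decision boundary is approximated by the margin between the top-two softmax probabilities, and the function name and threshold are hypothetical choices.

```python
import torch

def flag_hard_samples(logits: torch.Tensor, margin_threshold: float = 0.1) -> torch.Tensor:
    """Flag samples whose two highest class probabilities are nearly tied,
    i.e. samples that lie close to the decision boundary ("hard data")."""
    probs = torch.softmax(logits, dim=1)
    top2 = probs.topk(2, dim=1).values        # (N, 2): two largest probabilities
    margin = top2[:, 0] - top2[:, 1]          # small margin => near the boundary
    return margin < margin_threshold          # threshold is an assumption

# Example: 3 classes; only the first sample is confidently classified.
logits = torch.tensor([[4.0, 0.1, 0.1],   # easy: clear winner
                       [1.0, 0.9, 0.1],   # hard: top two nearly tied
                       [0.2, 0.2, 0.2]])  # hard: uniform prediction
print(flag_hard_samples(logits))          # tensor([False, True, True])
```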
“…Then, it applies semi-supervised learning to the noisy subset without using the given noisy labels. DSOS [39] uses the entropy of the interpolation between the model prediction and the given label to distinguish clean samples, in-distribution (ID) noise, and out-of-distribution (OOD) noise. It then corrects the labels of ID samples and proposes a dynamic softening strategy for OOD samples to reduce the harm of noisy labels.…”
Section: Existing Methods of Noisy Label Learning
confidence: 99%
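
As a rough sketch of the interpolation-entropy criterion described in this statement, the snippet below mixes the model prediction with the (possibly noisy) one-hot label and measures the entropy of the mixture; the fixed mixing weight `alpha` and the function name are assumptions, and the paper's actual formulation may differ in its details.

```python
import torch
import torch.nn.functional as F

def interpolation_entropy(logits, labels_onehot, alpha=0.5):
    """Entropy of a convex combination of prediction and given label.

    Intuition: for clean samples prediction and label agree, so the mixture
    stays peaked (low entropy); for ID noise they disagree, giving two peaks;
    for OOD samples the prediction is flat, keeping the entropy high.
    """
    probs = torch.softmax(logits, dim=1)
    mix = alpha * probs + (1 - alpha) * labels_onehot     # the interpolation
    return -(mix * mix.clamp_min(1e-12).log()).sum(dim=1)

# Example: all samples labelled class 0; 3 classes.
logits = torch.tensor([[5.0, 0.0, 0.0],    # agrees with label -> low entropy
                       [0.0, 5.0, 0.0],    # contradicts label -> two peaks
                       [0.3, 0.2, 0.1]])   # flat prediction -> high entropy
labels = F.one_hot(torch.tensor([0, 0, 0]), num_classes=3).float()
print(interpolation_entropy(logits, labels))
```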
“…Webvision [24] is a dataset of 2.4 million images gathered using search queries on the same 1k classes as the ILSVRC12 [20] challenge. Albert et al [1] have shown that the noise present in Webvision is predominantly out-of-distribution. Clothing1M [44] is a 1M-image clothing classification dataset, popular in the label noise community, which, as shown by its authors, contains only ID noise.…”
Section: Web-crawled Datasets
confidence: 99%
“…Another class of noise-robust algorithms has emerged recently to tackle noisy datasets presenting both in- and out-of-distribution noise. EvidentialMix [32] and DSOS [1] differentiate between ID-noisy and OOD data using a custom noise landscape, while JoSRC [46] identifies OOD samples as those having low agreement between two consistent views of the same image. The methods we compare against are described in more detail in Section 4.1, and we direct the interested reader to the recent label noise survey by Song et al [15] for an in-depth overview of state-of-the-art label-noise-robust algorithms.…”
Section: Tackling ID and OOD Noise
confidence: 99%
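
The two-view agreement test attributed to JoSRC above can be sketched as a symmetric divergence between the predictions for two augmentations of the same image. The snippet below uses the Jensen-Shannon divergence as one plausible realisation; `view_agreement` is a hypothetical name, not the actual Jo-SRC implementation.

```python
import torch

def view_agreement(logits_view1, logits_view2):
    """Jensen-Shannon divergence between the predictions for two augmented
    views of one image; a high value means low agreement, which the quoted
    statement associates with likely OOD samples."""
    p = torch.softmax(logits_view1, dim=1)
    q = torch.softmax(logits_view2, dim=1)
    m = 0.5 * (p + q)                                      # mixture distribution
    kl_pm = (p * (p.clamp_min(1e-12).log() - m.log())).sum(dim=1)
    kl_qm = (q * (q.clamp_min(1e-12).log() - m.log())).sum(dim=1)
    return 0.5 * (kl_pm + kl_qm)                           # 0 = perfect agreement

# Example: first pair agrees across views, second pair does not.
v1 = torch.tensor([[4.0, 0.0, 0.0], [4.0, 0.0, 0.0]])
v2 = torch.tensor([[3.8, 0.1, 0.1], [0.0, 4.0, 0.0]])
print(view_agreement(v1, v2))   # small value, then a large one
```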