2021
DOI: 10.3390/s21196661

Fuzzy Overclustering: Semi-Supervised Classification of Fuzzy Labels with Overclustering and Inverse Cross-Entropy

Abstract: Deep learning has been successfully applied to many classification problems including underwater challenges. However, a long-standing issue with deep learning is the need for large and consistently labeled datasets. Although current approaches in semi-supervised learning can decrease the required amount of annotated data by a factor of 10 or even more, this line of research still uses distinct classes. For underwater classification, and uncurated real-world datasets in general, clean class boundaries can often…

Cited by 13 publications (32 citation statements)
References 33 publications
“…Schmarje et al. reframe the handling of fuzzy labels as a semi-supervised learning problem by using a small set of certain images and a large number of fuzzy images that are treated as unlabeled data. 14 The authors apply the overclustering concept not only to improve classification accuracy on the labeled data, but also to improve the clustering, and therefore the identification, of substructures in fuzzy data (ambiguous data, including intra- or interobserver variability). Schmarje et al. show that their alternative inverse cross-entropy loss function and their overclustering allow them to cluster the fuzzy data, discover more meaningful substructures, and thereby allow experts to analyze fuzzy images more consistently.…”
Section: Overclustering
confidence: 99%
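The core of the overclustering idea is that the network predicts more clusters than there are ground-truth classes, and the fine clusters are mapped back to classes for evaluation. A minimal sketch, assuming a standard majority-vote mapping on the certainly labeled subset (the toy arrays and the helper `map_clusters_to_classes` are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def map_clusters_to_classes(cluster_ids, labels, n_clusters):
    """Majority-vote mapping from fine clusters to ground-truth classes."""
    mapping = {}
    for k in range(n_clusters):
        members = labels[cluster_ids == k]
        # empty clusters get no class (-1)
        mapping[k] = int(np.bincount(members).argmax()) if members.size else -1
    return mapping

# Toy data: 6 labeled samples, 2 classes, 4 clusters (more clusters than classes).
cluster_ids = np.array([0, 0, 1, 2, 3, 3])  # overclustering output
labels      = np.array([0, 0, 0, 1, 1, 1])  # certain (non-fuzzy) labels
mapping = map_clusters_to_classes(cluster_ids, labels, n_clusters=4)
preds = np.array([mapping[k] for k in cluster_ids])
accuracy = float((preds == labels).mean())  # → 1.0 on this toy data
```

Because several fine clusters can map to the same class, samples of one class that fall into different clusters (here clusters 2 and 3 for class 1) remain separated as candidate substructures for expert inspection.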
“…Other work [63,19,70] considers frameworks for learning from fuzzy human labels given possibly ambiguous data. Our dataset differs from these papers in that none of the listed datasets include images that are intentionally ambiguous, or depict more than a single object.…”
Section: Collecting Ambiguous Data In Computer Vision
confidence: 99%
“…To overcome this issue, we chose to use the so-called overclustering approach, i.e., we divided the dataset into a large number (K = 100) of clusters. This method is also applied for data clustering with fuzzy labels [24], which is essentially what our data is. The number of clusters K = 100 was chosen following the expert suggestion of expected particle groups (10), with the intention of partitioning the hidden-representation feature space into finely defined clusters containing semantically homogeneous sets of examples.…”
Section: Clustering Approach
confidence: 99%
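The clustering step described in this statement can be sketched as follows; the random 32-dimensional features and the use of scikit-learn's `KMeans` are illustrative assumptions standing in for the hidden representations and clustering setup of the cited work:

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in features; the cited work clusters hidden representations of a
# trained network (the 32-d random features here are purely illustrative).
rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 32))

K = 100  # far more clusters than the ~10 expected semantic groups
km = KMeans(n_clusters=K, n_init=4, random_state=0).fit(features)
cluster_ids = km.labels_

# Each fine cluster holds a small set of examples that are likely to be
# semantically homogeneous; experts can then inspect clusters, not images.
sizes = np.bincount(cluster_ids, minlength=K)
```

Setting K an order of magnitude above the expected number of groups trades cluster purity against cluster size: fine clusters are small but internally consistent, which is what makes them useful for discovering substructures in fuzzy data.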