2020
DOI: 10.3906/elk-1909-162

A supervised learning approach for detecting erroneous samples in embeddings

Abstract: Visualizing multidimensional data has been a crucial task in recent years regarding the growing amount of data from various sources. To achieve this, dimensionality reduction algorithms have been used to reduce the number of dimensions for visualization of the data on a screen. However, these algorithms may fail to faithfully represent high dimensional data in lower dimensions and eventually lead to erroneous visualizations. In this work, we propose an error detection algorithm for dimensionality reduction alg…

Citations: cited by 2 publications (3 citation statements)
References: 20 publications (31 reference statements)
“…However, when we examine an embedding perceptually, what we consider as an erroneous sample is the one that does not belong to the class of the majority of its neighbors. In a previous study [29], an error detection algorithm based on classification was presented for dimensionality reduction. We advocate that a binary classifier would be inferior than a regressor, since there is no threshold value suitable for every dataset.…”
Section: Discussion (mentioning)
confidence: 99%
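
The perceptual notion in the statement above (a sample is erroneous when it does not share the class of most of its neighbors in the embedding) can be sketched in a few lines. This is only an illustration of that definition, not the classification-based method of the cited study; the function name `neighbor_disagreement`, the neighborhood size `k`, the 0.5 cut-off, and the toy data are all assumptions made for the example.

```python
from math import dist


def neighbor_disagreement(points, labels, k=3):
    """For each embedded point, return the fraction of its k nearest
    neighbors (Euclidean distance in the embedding) with a different label."""
    scores = []
    for i, p in enumerate(points):
        # Rank every other point by distance to p and keep the k closest.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )[:k]
        differing = sum(labels[j] != labels[i] for j in others)
        # Guard against an embedding that contains a single point.
        scores.append(differing / len(others) if others else 0.0)
    return scores


# Toy 2-D embedding (hypothetical data): two well-separated clusters,
# plus one sample labeled "b" that landed inside the "a" cluster.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b", "b"]

scores = neighbor_disagreement(points, labels, k=3)

# A sample is flagged when more than half of its neighbors disagree with it.
flagged = [i for i, s in enumerate(scores) if s > 0.5]
print(flagged)
```

Note that the fixed 0.5 cut-off is exactly the kind of threshold the quoted authors argue against; keeping the continuous `scores` instead of `flagged` is closer in spirit to their preference for a regressor.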
“…In the literature, there are many studies aiming to generate confidence and detect errors for various domains such as medical image registration [22][23][24][25] and stereo matching [26][27][28]. Recently, DR becomes also the focus of error estimation research [29]. Evaluating and comparing the embeddings are typically done qualitatively, by placing projections side by side and letting human judgment to determine which projection is the best.…”
Section: Introduction (mentioning)
confidence: 99%
“…Ranking-based metrics [5, 6] focus on retaining local neighborhood rankings in high and low dimensions instead of considering the preservation of ground truth target labels (label-based). There are also some label-based error detection and confidence estimation methods that have been developed specifically for t-SNE embeddings [7, 8], in a similar way to those in many other domains such as medical image registration [9, 10] and stereo matching [11]. What makes the label-based confidence estimation algorithm [8] unique is that it generates confidence scores for each and every sample in a t-SNE embedding with a supervised Random Forest (RF) regression algorithm based on target class labels.…”
Section: Introduction (mentioning)
confidence: 99%