2018
DOI: 10.1093/imaiai/iay011

Mahalanobis distance informed by clustering

Abstract: A fundamental question in data analysis, machine learning and signal processing is how to compare data points. The choice of distance metric is especially challenging for high-dimensional data sets, where the problem of meaningfulness is more prominent (e.g. the Euclidean distance between images). In this paper, we propose to exploit a property of high-dimensional data that is usually ignored: the structure stemming from the relationships between the coordinates. Specifically we show that…

Cited by 8 publications (4 citation statements)
References 24 publications
“…categorical variable, then a two-stage cluster analysis is recommended. 3 It is especially important to take into account whether there are nonstandard observations and whether they should be excluded, as well as whether standardization of variables is required. Non-standard observations can be observations that are not representative of the population, but that are representative of the specific sample and research problem.…”
Section: Research Design In Cluster Analysis
mentioning
confidence: 99%
“…The Moore-Penrose pseudo-inverse W^- is commonly used in cases where the covariance matrix is not invertible, see Wei et al. [41] and Lahav et al. [22], for example. This pseudo-inverse is constructed using the nonzero eigenvalues and corresponding eigenvectors of the covariance matrix W, and satisfies the four Moore-Penrose conditions [17].…”
Section: Introduction
mentioning
confidence: 99%
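
The construction described in the statement above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the method of the cited papers: the helper names `pinv_from_eig` and `mahalanobis_pinv`, the relative tolerance `tol` used to decide which eigenvalues count as nonzero, and the random sample data are all hypothetical.

```python
import numpy as np

def pinv_from_eig(W, tol=1e-10):
    """Pseudo-inverse of a symmetric PSD matrix W built from its nonzero eigenpairs.

    `tol` is an assumed relative threshold below which an eigenvalue is treated as zero.
    """
    eigvals, eigvecs = np.linalg.eigh(W)       # eigendecomposition of a symmetric matrix
    keep = eigvals > tol * eigvals.max()       # select the (numerically) nonzero eigenvalues
    V = eigvecs[:, keep]                       # corresponding eigenvectors as columns
    return (V / eigvals[keep]) @ V.T           # V diag(1/lambda) V^T

def mahalanobis_pinv(x, y, W_pinv):
    """Mahalanobis-type distance between x and y using a pseudo-inverse covariance."""
    d = x - y
    return float(np.sqrt(max(d @ W_pinv @ d, 0.0)))

# Hypothetical data: 5 samples in 10 dimensions, so the sample covariance is singular.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 10))
W = np.cov(X, rowvar=False)                    # 10 x 10 matrix of rank at most 4
W_pinv = pinv_from_eig(W)

dist = mahalanobis_pinv(X[0], X[1], W_pinv)
```

Because fewer samples than coordinates are available here, `W` cannot be inverted directly, which is the situation the statement refers to. Keeping only eigenvalues above a threshold is one common way to separate nonzero from numerically zero eigenvalues; the right threshold depends on the data.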
“…Efficiencies (21) and (22) for different k, with three different sets of eigenvalues of the covariance matrix W as given in Table 1 in Sect. 3.1.…”
mentioning
confidence: 99%