Abstract—Target classification in hyperspectral imagery has been demonstrated to be very useful in remote-sensing applications. While spectral bands provide information for classification, they give rise to a large number of features. However, a large number of features often degrades performance. In such situations, dimensionality reduction can be very helpful. There are many such techniques in the literature, and the most popular one is Fisher's linear discriminant analysis (LDA). For two-class problems, LDA can be shown to be optimal; for the multi-class case, it is not. As such, a multi-class problem is cast into a binary one. This formulation not only simplifies the problem but also works well in practice; however, it lacks theoretical justification. In this paper we show the connection between the above formulation and Relief feature selection, thereby providing a sound basis for the observed benefits associated with this formulation. Furthermore, we propose a margin-based algorithm for dimensionality reduction that addresses some of the problems facing the two-class formulation. We provide experimental results that corroborate our analysis.

Index Terms—Classification, dimensionality reduction, Relief
I. Introduction

Target classification in hyperspectral imagery has been demonstrated to be very challenging and, at the same time, extremely useful in many remote-sensing applications [1], [2], [3]. While spectral-reflectance measurements provide information for target detection and classification, they generate a large number of features, resulting in a high-dimensional measurement space [4]. However, a large number of features often degrades classification performance, a consequence of the curse of dimensionality. In such situations, feature extraction or selection methods play an important role by significantly reducing the number of features used to build classifiers.

There are many dimensionality reduction techniques for classification in the literature. The most popular one is Fisher's linear discriminant analysis (LDA) [5]. In LDA, we are given a set of $N$ examples $\mathcal{Z} = \{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$, where $\mathbf{x}_i \in \Re^q$ are the $q$-dimensional inputs and $y_i$ are scalar labels. Consider a $C$-class problem, where $\mathbf{m}$ is the mean vector of all data and $\mathbf{m}_i$ is the mean vector of the $i$th class data. A within-class scatter matrix characterizes the scatter of samples around their respective class mean vectors, and it is expressed by
$$S_w = \sum_{i=1}^{C} \frac{p_i}{N_i} \sum_{\mathbf{x}_j \in \text{class } i} (\mathbf{x}_j - \mathbf{m}_i)(\mathbf{x}_j - \mathbf{m}_i)^t,$$
where $N_i$ is the number of examples in the $i$th class, $p_i$ ($\sum_i p_i = 1$) represents the proportion of class $i$, and $t$ denotes matrix transpose. A between-class scatter matrix characterizes the scatter of the class means around the overall mean $\mathbf{m}$:
$$S_b = \sum_{i=1}^{C} p_i (\mathbf{m}_i - \mathbf{m})(\mathbf{m}_i - \mathbf{m})^t.$$
Thus, LDA finds the projection matrix $W$ that maximizes the objective
$$J(W) = \frac{|W^t S_b W|}{|W^t S_w W|}.$$
We can obtain $W$ that maximizes $J(W)$ by solving the generalized eigenvalue problem $S_b \mathbf{w}_i = \lambda_i S_w \mathbf{w}_i$ (a small numerical sketch of this procedure is given at the end of this section).

From the Bayes perspective, LDA is optimal for two Gaussians with equal covariances [6], [7]. However, LDA is not optimal for multiple Gaussian distributions or classes with unequal covariance matrices. To...
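The LDA procedure above is straightforward to implement. The following is a minimal numerical sketch, not the authors' code: it builds $S_w$ and $S_b$ from labeled data and solves the generalized eigenvalue problem $S_b \mathbf{w}_i = \lambda_i S_w \mathbf{w}_i$ for the leading projection directions. The function name, the synthetic data, and the parameter `n_components` are illustrative assumptions, not quantities from the paper.

```python
# Minimal sketch of the LDA projection described above (illustrative only).
import numpy as np
from scipy.linalg import eigh

def lda_projection(X, y, n_components):
    """Return a q x n_components matrix W maximizing the Fisher criterion J(W)."""
    classes, counts = np.unique(y, return_counts=True)
    N, q = X.shape
    m = X.mean(axis=0)                       # overall mean
    S_w = np.zeros((q, q))                   # within-class scatter
    S_b = np.zeros((q, q))                   # between-class scatter
    for c, N_i in zip(classes, counts):
        X_i = X[y == c]
        m_i = X_i.mean(axis=0)
        p_i = N_i / N                        # class proportion
        # class covariance around m_i, normalized by N_i (bias=True)
        S_w += p_i * np.cov(X_i, rowvar=False, bias=True)
        d = (m_i - m).reshape(-1, 1)
        S_b += p_i * (d @ d.T)
    # Generalized eigenvalue problem S_b w = lambda S_w w; eigh returns
    # eigenvalues in ascending order, so keep the largest n_components.
    eigvals, eigvecs = eigh(S_b, S_w)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order]

# Example: project 3-class, 10-dimensional synthetic data onto 2 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=k, size=(50, 10)) for k in range(3)])
y = np.repeat(np.arange(3), 50)
W = lda_projection(X, y, n_components=2)
Z = X @ W
print(Z.shape)  # (150, 2)
```

As the sketch suggests, the projection keeps at most $C-1$ informative directions, since $S_b$ has rank at most $C-1$; this is one reason the two-class reformulation discussed later changes what the method can extract.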