2006
DOI: 10.1007/11871842_48

Subspace Metric Ensembles for Semi-supervised Clustering of High Dimensional Data

Abstract: A critical problem in clustering research is the definition of a proper metric to measure distances between points. Semi-supervised clustering uses the information provided by the user, usually defined in terms of constraints, to guide the search of clusters. Learning effective metrics using constraints in high dimensional spaces remains an open challenge. This is because the number of parameters to be estimated is quadratic in the number of dimensions, and we seldom have enough side information to ach…

Cited by 9 publications (3 citation statements)
References: 11 publications

“…RSM is based on random sampling for original feature components to obtain different feature subsets. In recent years, it has been applied to FS, clustering, and other areas. When it is used for FS, it often finds the optimal result by evaluating a predefined number of features. It is a crucial step in determining the dimensions of the subspace in RSM.…”
Section: The Proposed Methods (mentioning)
confidence: 99%
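
The random sampling step described in the statement above can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: the function names `random_subspaces` and `project`, the fixed seed, and the choice of sampling without replacement within each subset are all assumptions made for the example. The subspace dimension is passed in as a parameter because, as the statement notes, choosing it is a crucial step.

```python
import random


def random_subspaces(n_features, subspace_dim, n_subspaces, seed=0):
    """Draw feature-index subsets for a random subspace ensemble.

    Hypothetical helper: each subset holds `subspace_dim` distinct indices
    sampled uniformly from range(n_features).
    """
    if not 0 < subspace_dim <= n_features:
        raise ValueError("subspace_dim must be in 1..n_features")
    rng = random.Random(seed)
    return [
        sorted(rng.sample(range(n_features), subspace_dim))
        for _ in range(n_subspaces)
    ]


def project(points, feature_idx):
    """Keep only the selected feature columns of each point."""
    return [[row[j] for j in feature_idx] for row in points]


# Four random 3-dimensional subspaces of a 10-dimensional feature space.
subspaces = random_subspaces(n_features=10, subspace_dim=3, n_subspaces=4)

# Two toy 10-dimensional points, projected onto the first subspace.
points = [list(range(10)), list(range(10, 20))]
projected = project(points, subspaces[0])
```

Each subset would then feed a separate learner (for example, one clustering or metric-learning run per subspace), with the results combined afterwards; how they are combined is outside this sketch.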
“…There are even approaches that connect this idea with the idea of having constraints (see Sect. 2.3) that can guide the distance-learning (Yan and Domeniconi 2006). Let us note that, for this general approach of learning one (combined) result based on several representations, strong connections to ensemble clustering (Sect.…”
Section: Multiview Clustering (mentioning)
confidence: 99%
“…Furthermore, learning an effective full rank distance metric by using constraints in high-dimensional spaces is impracticable since (a) the number of parameters to be estimated is the square of the dimensionality, and (b) typically insufficient side information is available in order to obtain accurate estimates. A typical solution to this problem is to reduce the dimensionality and to modify the distance metric in the reduced space, as in (Yan and Domeniconi, 2006). However, important information may be lost during a completely unsupervised dimension reduction (that does not use the side information) which may degrade the subsequent metric learning.…”
Section: Introduction (mentioning)
confidence: 99%
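
To make the quadratic growth mentioned in the statement above concrete, the short sketch below counts the entries of a full d-by-d metric matrix, and the free parameters if that matrix is assumed symmetric (as a Mahalanobis-style metric would be). The function names and the example dimensions (1000 and 20) are illustrative assumptions, not values taken from the cited papers.

```python
def full_metric_params(d):
    """Entries in an unconstrained d x d metric matrix."""
    return d * d


def symmetric_metric_params(d):
    """Free parameters if the d x d matrix is assumed symmetric."""
    return d * (d + 1) // 2


# Hypothetical original space versus a hypothetical reduced subspace.
for d in (1000, 20):
    print(d, full_metric_params(d), symmetric_metric_params(d))
# prints:
# 1000 1000000 500500
# 20 400 210
```

Under either count the number of parameters grows quadratically with d, which is why estimating a metric in a low-dimensional subspace needs far less side information than estimating it in the full space.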