2003
DOI: 10.1287/ijoc.15.2.208.14448
Relationship-Based Clustering and Visualization for High-Dimensional Data Mining

Abstract: In several real-life data-mining applications, data reside in very high (1,000 or more) dimensional space, where both clustering techniques developed for low-dimensional spaces (k-means, BIRCH, CLARANS, CURE, DBScan, etc.) and visualization methods such as parallel coordinates or projective visualizations are rendered ineffective. This paper proposes a relationship-based approach that alleviates both problems, side-stepping the "curse-of-dimensionality" issue by working in a suitable similarity space in…

Cited by 167 publications (191 citation statements)

References 45 publications
“…To evaluate the results of the test cases we used external quality measures described in [Str02], purity, F-measure, entropy and mutual information. These measures are defined as follows.…”
Section: Discussion
confidence: 99%
“…[Strehl, 2002] compares several metrics according to their different biases and scaling properties: purity and entropy are extreme cases where the bias is towards small clusters, because they reach a maximal value when all clusters are of size one. Combining precision and recall via a balanced F measure, on the other hand, favors coarser clusterings, and random clusterings do not receive zero values (which is a scaling problem).…”
Section: Motivation
confidence: 99%
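The small-cluster bias [Strehl, 2002] describes can be seen directly: under the standard majority-class definition of purity (assumed here; the helper `purity` is illustrative, not from the paper), an all-singleton clustering attains the maximal score of 1.0 even though it conveys nothing.

```python
from collections import Counter

def purity(true_labels, cluster_ids):
    """Fraction of points lying in their cluster's majority class.
    Standard definition; assumed consistent with [Strehl, 2002]."""
    clusters = {}
    for t, c in zip(true_labels, cluster_ids):
        clusters.setdefault(c, []).append(t)
    n_majority = sum(max(Counter(members).values())
                     for members in clusters.values())
    return n_majority / len(true_labels)

true = [0, 0, 1, 1, 2, 2]
coarse = [0, 0, 0, 1, 1, 1]          # two coarse clusters
singletons = list(range(len(true)))  # every point its own cluster

print(purity(true, coarse))      # 4/6 ~ 0.667
print(purity(true, singletons))  # 1.0 -- maximal, yet uninformative
```

This is why measures such as normalized mutual information, which penalize such degenerate clusterings, are often preferred alongside purity.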
“…To compute purity (Strehl 2002), each induced phone ω l is assigned to the labeled phone c i whose frames are most frequent in ω l , and then the accuracy of this assignment is measured by counting the number of correctly assigned frames and dividing by the total number of frames N :…”
Section: Evaluation Measures For Segmentation
confidence: 99%
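The formula cut off at the end of that statement is not recoverable from this page; in its standard form (an assumption based on the usual definition of purity, not the source text), the described procedure corresponds to

```latex
\mathrm{purity} = \frac{1}{N} \sum_{l} \max_{i} \left| \omega_l \cap c_i \right|
```

where $|\omega_l \cap c_i|$ counts the frames of induced phone $\omega_l$ that carry label $c_i$, so each $\omega_l$ contributes the frame count of its most frequent labeled phone.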