2007
DOI: 10.1109/tkde.2007.190649

Top-Down Parameter-Free Clustering of High-Dimensional Categorical Data

Cited by 63 publications (51 citation statements)
References 34 publications
“…In [7], OLIN, an online classification system, dynamically adjusts The recent research literature has proposed more tractable techniques for anomaly detection and classification [8,9,10,11]. These proposals rely on a common approach to data analysis: they apply dimensionality reduction techniques such as sketches [12,3] or principal components [13,14] to the aggregate network traffic.…”
Section: State Of the Art (mentioning)
confidence: 99%
“…The algorithm is particularly suitable for large high-dimensional databases, but it is sensitive to a user defined parameter (the repulsion factor), which weights the importance of the compactness/sparseness of a cluster. Other approaches [7], [8], [9], [10] extend the computation of frequencies to frequent patterns in the underlying data set. In particular, each transaction is seen as a relation over some sets of items, and a hyper-graph model is used for representing these relations.…”
Section: Related Work (mentioning)
confidence: 99%
“…In recent years there has been an increasing interest to analyze categorical data in a data warehouse context where data sets are rather large and may have a high number of categorical dimensions [4,6,8,15]. However, many traditional techniques associated to the exploration of data sets assume the attributes have continuous data (covariance, density functions, PCA, etc.).…”
Section: The Need To Encode (mentioning)
confidence: 99%