2001
DOI: 10.1007/3-540-45357-1_31

Scalable Hierarchical Clustering Method for Sequences of Categorical Values

Abstract: Data clustering methods have many applications in the area of data mining. Traditional clustering algorithms deal with quantitative or categorical data points. However, there exist many important databases that store categorical data sequences, where significant knowledge is hidden behind sequential dependencies between the data. In this paper we introduce a problem of clustering categorical data sequences and present an efficient scalable algorithm to solve the problem. Our algorithm implements the …

Citations: cited by 26 publications (11 citation statements)
References: 15 publications
“…POPC algorithm proposed in [19], which starts with a set of elementary sub-clusters and merges them iteratively until the pre-defined stop condition defined in advance is satisfied, is pretty similar to the method introduced in this paper. In [19], the author introduces two variants as POPC-J using the Jaccard coefficient [16] of the clusters' contents and POPC-GA using the group average of co-occurrences of patterns describing clustering.…”
Section: Definition 15 (Session P) · mentioning
confidence: 95%
“…In [19], the author introduces two variants as POPC-J using the Jaccard coefficient [16] of the clusters' contents and POPC-GA using the group average of co-occurrences of patterns describing clustering. POPC-GA is selected to compare with the proposed algorithm because POPC-GA is much more efficient than POPC-J.…”
Section: Definition 15 (Session P) · mentioning
confidence: 99%
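
The merging scheme described in the statements above (start from elementary sub-clusters, repeatedly merge, stop when a condition is met) can be illustrated with a minimal sketch. This is not the POPC implementation: representing each cluster's contents as a Python set, the name `merge_until`, and the `min_similarity` threshold used as the stop condition are all assumptions made for illustration. Only the use of the Jaccard coefficient comes from the quoted text.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient |a & b| / |a | b| (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_until(clusters, min_similarity):
    """Greedily merge the most similar pair of clusters.

    Hypothetical stop condition: stop as soon as the best pair's Jaccard
    coefficient falls below `min_similarity`, or one cluster remains.
    """
    clusters = [set(c) for c in clusters]  # copy so the input is not mutated
    while len(clusters) > 1:
        # Score every pair (i < j) and keep the most similar one.
        (i, j), best = max(
            (((i, j), jaccard(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < min_similarity:
            break
        clusters[i] |= clusters[j]  # merge j into i
        del clusters[j]             # safe because j > i
    return clusters
```

For example, `merge_until([{1, 2, 3}, {2, 3, 4}, {7, 8}], 0.5)` merges the first two sets, which share two of four distinct elements, and leaves `{7, 8}` on its own. Scoring every pair on every pass is quadratic in the number of clusters, so this sketch does not reflect the scalability claims made for the original algorithm.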
“…Clones may also form implicit links between components that share some functionality. All this contributes towards "software aging" [36].…”
Section: The Cloning Problem · mentioning
confidence: 99%
“…Morzy et al [8] assumed that sequential patterns were given and then started clustering with data that included more than one of these given sequential patterns. Hay et al [5] presented a clustering algorithm that used an edit distance method to measure the similarity between sequences, while Wang and Zaiane [9] proposed a clustering method based on a sequence alignment method to measure the similarity between sequences.…”
Section: Related Work · mentioning
confidence: 99%
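
The statement above mentions edit distance as a similarity measure between sequences. As a rough illustration, the sketch below computes the classic Levenshtein distance over sequences of categorical values. The unit cost for insertion, deletion, and substitution is an assumption; the cost model used in the cited work is not given here.

```python
def edit_distance(s, t):
    """Levenshtein distance between two sequences of hashable items.

    Assumes unit cost for each insertion, deletion, and substitution.
    """
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, 1):
        curr = [i]  # distance from s[:i] to the empty prefix of t
        for j, b in enumerate(t, 1):
            curr.append(min(
                prev[j] + 1,             # delete a from s
                curr[j - 1] + 1,         # insert b into s
                prev[j - 1] + (a != b),  # substitute, free if items match
            ))
        prev = curr
    return prev[-1]
```

The function works on any sequence type, so it accepts both strings and lists of category labels, for example `edit_distance(["A", "B", "C"], ["A", "C"])`.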