2012
DOI: 10.4304/jetwi.4.1.51-59

Careful Seeding Method based on Independent Components Analysis for k-means Clustering

Abstract: The k-means clustering method is a widely used clustering technique for the Web because of its simplicity and speed. However, the clustering result depends heavily on the chosen initial clustering centers, which are uniformly chosen at random from the data points. We propose a seeding method that is based on the independent component analysis for the k-means clustering method. We evaluate the performance of our proposed method and compare it with other seeding methods by using benchmark datasets. We also appli…

Citations: cited by 17 publications (9 citation statements)
References: 18 publications
“…However, the KKZ method sometimes finds bad clusters because unfortunately it depends on outlier data points [8]. This method has one obvious pitfall.…”
Section: Fig 3: KKZ Methods (mentioning)
confidence: 99%
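
The outlier sensitivity described in the statement above can be illustrated with a minimal sketch. It assumes the commonly described form of KKZ seeding: the first seed is the point with the largest norm, and each later seed is the point farthest from its nearest already-chosen seed. The function name `kkz_seeds` and the toy data are hypothetical and are not taken from the cited papers.

```python
import math

def kkz_seeds(points, k):
    """Pick k seeds with a KKZ-style rule (illustrative sketch only)."""
    # Assumed first step: start from the point with the largest Euclidean norm.
    seeds = [max(points, key=lambda p: math.hypot(*p))]
    while len(seeds) < k:
        # Assumed next step: take the point whose distance to its nearest
        # already-chosen seed is largest.
        nxt = max(points, key=lambda p: min(math.dist(p, s) for s in seeds))
        seeds.append(nxt)
    return seeds

# Made-up data: two tight groups plus a single far-away outlier.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
        (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
        (50.0, -40.0)]
seeds = kkz_seeds(data, 2)
print(seeds)
```

With k = 2 on this toy data, the outlier is selected as the first seed, so the two genuine groups are left to share the remaining seed. That is one way the dependence on outliers can show up.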
“…First part in the first iteration is initialization of k seeds for the k-means algorithm, and initialization of k-weights for each data point. We use k-means++ [44], [49] algorithm to obtain the initial seeds of the k-means clustering, while the k-weights for each data point are initialized to zero. The second part is updating the k-weights [22].…”
Section: B Clustering Step (mentioning)
confidence: 99%
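
The seeding step mentioned in the statement above can be sketched as follows, assuming the usual description of k-means++: the first seed is drawn uniformly at random, and each later seed is drawn with probability proportional to its squared distance from the nearest seed chosen so far. The name `kmeanspp_seeds`, the `rng` parameter, and the toy data are hypothetical; the k-weights update from the citing paper is not covered here.

```python
import math
import random

def kmeanspp_seeds(points, k, rng=None):
    """Pick k seeds with k-means++ style D^2 sampling (illustrative sketch)."""
    rng = rng or random.Random(0)
    # First seed: uniform random choice among the data points.
    seeds = [rng.choice(points)]
    while len(seeds) < k:
        # Squared distance from every point to its nearest chosen seed.
        d2 = [min(math.dist(p, s) ** 2 for s in seeds) for p in points]
        # Draw the next seed with probability proportional to d2.
        # Assumes at least k distinct points, so the weights are not all zero.
        seeds.append(rng.choices(points, weights=d2, k=1)[0])
    return seeds

# Made-up data: three loose groups of distinct points.
data = [(0.0, 0.0), (0.2, 0.1), (4.0, 4.0), (4.1, 3.9), (9.0, 0.0), (9.2, 0.3)]
seeds = kmeanspp_seeds(data, 3, random.Random(42))
print(seeds)
```

Already-chosen seeds have zero weight, so they are not drawn again. Distant points are favoured but not guaranteed, which is the main difference from a deterministic farthest-point rule.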
“…Table 1 gives the data set descriptions. For each data set, the number of clusters (K) was set equal to the number of classes (K′), as commonly seen in the related literature [53,7,100,94,2,21,64,86,24,25,41].…”
Section: Data Set Descriptions (mentioning)
confidence: 99%