The 2003 Congress on Evolutionary Computation, 2003. CEC '03.
DOI: 10.1109/cec.2003.1299776

Unsupervised hierarchical clustering via a genetic algorithm

Abstract: We present a clustering algorithm which is unsupervised, incremental, and hierarchical. The algorithm is distance-based and creates centroids. Then we combine the power of evolutionary forces with the clustering algorithm, counting on good clusterings to evolve to yet better ones. We apply our approach to standard data sets, and get very good results. Finally, we use bagging to pool the results of different clustering trials, and again get very good results.
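The abstract gives no implementation details, so the following is only a minimal sketch of what an incremental, distance-based, centroid-creating clustering pass could look like. The function name `incremental_centroids`, the `threshold` parameter, and the rule "start a new cluster when no centroid is within the threshold" are assumptions made for illustration and are not taken from the paper. The hierarchical, genetic-algorithm, and bagging steps are not shown.

```python
import math

def incremental_centroids(points, threshold):
    """Single-pass clustering sketch (hypothetical, not the paper's algorithm).

    Each point joins its nearest centroid if that centroid lies within
    `threshold` (Euclidean distance); otherwise it seeds a new cluster.
    """
    centroids = []  # running mean of each cluster
    counts = []     # number of points absorbed by each cluster
    labels = []     # cluster index assigned to each input point
    for p in points:
        best, best_d = None, math.inf
        for i, c in enumerate(centroids):
            d = math.dist(p, c)
            if d < best_d:
                best, best_d = i, d
        if best is None or best_d > threshold:
            # No centroid is close enough: create a new one at this point.
            centroids.append(list(p))
            counts.append(1)
            labels.append(len(centroids) - 1)
        else:
            # Update the running mean of the nearest centroid.
            counts[best] += 1
            n = counts[best]
            centroids[best] = [c + (x - c) / n for c, x in zip(centroids[best], p)]
            labels.append(best)
    return centroids, labels

points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
centroids, labels = incremental_centroids(points, threshold=3.0)
```

In this sketch the result depends on the order in which points arrive and on the chosen threshold, which is typical of incremental schemes.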

Cited by 11 publications (26 citation statements)
References 21 publications
“…Furthermore, the computational complexity of this algorithm grows exponentially for large data sets, as is the case of DNA microarray data. The method of Greene (2003) has the drawback of having a nominal scale (non-real values) on both, data set and proximity matrix, which restricts its use. Moreover, the approach of centroid computation for hierarchical clustering can imply a time overload.…”
Section: Related Work (mentioning)
confidence: 99%
“…Additionally, both works (Lozano and Larrañaga, 1999) and (Greene, 2003) carried out for other data contexts very different of gene expression data, and neither of them, introduce heuristics to reduce the complexity of the search space. According to latter, the first experiments carried out on HCGA without including constraints to the search space (as made in Lozano and Larrañaga (1999) and Greene (2003)), showed that HCGA is intractable according to runtime when the data set is large. This suggests that the GA-based search for an optimum dendrogram without including constraints still remains intractable, in general.…”
Section: Related Work (mentioning)
confidence: 99%
“…In this work, we have used GAs as a function optimization problem to evolve structures representing sets of acoustic segments rules as used in Hansohm (2000), Aissiou and Guerti (2004), Greene (2003).…”
Section: Introduction (mentioning)
confidence: 99%
“…Greene [12] proposes a method that generates hierarchies of partitions. It begins with a top-down method by which the initial population is subdivided into several subpopulations.…”
Section: B Genetic Operators (mentioning)
confidence: 99%