2004
DOI: 10.1590/s1415-47572004000400025

Comparative analysis of clustering methods for gene expression time course data

Abstract: This work performs a data-driven comparative study of clustering methods used in the analysis of gene expression time courses (or time series). Five clustering methods found in the literature of gene expression analysis are compared: agglomerative hierarchical clustering, CLICK, dynamical clustering, k-means and self-organizing maps. In order to evaluate the methods, a k-fold cross-validation procedure adapted to unsupervised methods is applied. The accuracy of the results is assessed by the comparison of the …

Cited by 61 publications
(40 citation statements)
References 20 publications
“…The decision of the best number of clusters to perform the GK algorithm, is based on the best overall VAF. Once the decision of the number of clusters is made, and a dynamic formula is obtained, next step is to test the efficiency of the approach, through the cross-validation, so call k-fold method [6]. The main objective of this method is to verify the model behavior on multiple data groups, to be used for training and testing.…”
Section: Results Of the Fuzzy Identification Methods For The Hydrogen
Citation type: mentioning
confidence: 99%
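The k-fold procedure mentioned in the statement above can be illustrated with a minimal sketch. It only shows how a set of items is partitioned into k folds so that each fold serves once as the test set. The function name `k_fold_splits` and the contiguous, unshuffled fold layout are assumptions made for illustration; they are not taken from the cited paper or from the citing work.

```python
def k_fold_splits(n_items, k):
    """Yield (train_indices, test_indices) pairs for k contiguous folds.

    Hypothetical helper for illustration: folds are contiguous and not
    shuffled, and the first (n_items % k) folds get one extra item.
    """
    if not 1 <= k <= n_items:
        raise ValueError("k must be between 1 and n_items")
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        # The current fold is held out for testing; the rest is used for training.
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size


# Example: 10 items split into 3 folds.
splits = list(k_fold_splits(10, 3))
```

Each index appears in exactly one test fold, so a model fitted on the training part of every split is evaluated once on every item.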
“…In the case of the average method, the distance between two clusters is calculated by the average distance between the patterns in one group and the patterns in the other group. Such a method has been extensively used in the literature of gene expression analysis [2], [3], [26], [27], although experimental results have shown that in many cases the complete linkage outperforms it [4].…”
Section: B Clustering Methods and Recovery Measure
Citation type: mentioning
confidence: 99%
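The average linkage rule described in the statement above (the distance between two clusters is the mean distance between the patterns of one group and the patterns of the other) can be sketched as follows, with complete linkage shown for contrast. The function names, the choice of Euclidean distance, and the toy points are assumptions made for illustration; the quoted text does not specify a distance measure.

```python
from itertools import product
from math import dist  # Euclidean distance, Python 3.8+


def average_linkage(cluster_a, cluster_b):
    """Mean distance over all pairs with one pattern from each cluster."""
    pairs = list(product(cluster_a, cluster_b))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def complete_linkage(cluster_a, cluster_b):
    """Largest distance over all pairs with one pattern from each cluster."""
    return max(dist(p, q) for p, q in product(cluster_a, cluster_b))


# Toy 2-D patterns (hypothetical data, not from any cited study).
a = [(0.0, 0.0), (0.0, 2.0)]
b = [(3.0, 0.0), (3.0, 4.0)]

avg = average_linkage(a, b)   # mean of the four cross-cluster distances
far = complete_linkage(a, b)  # distance of the farthest cross-cluster pair
```

In an agglomerative procedure, the two clusters with the smallest such distance are merged at each step; the linkage rule only changes how that inter-cluster distance is computed.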
“…The average linkage and complete linkage methods had cR values as high as k-means for only one dataset, "Nutt-V3". While the main objective of our study is not the comparison of the clustering methods themselves, the shortcoming of hierarchical methods is noticeable in other comparative studies on gene expressions data [26], [27], [4]. However, hierarchical methods are still widely used in clustering gene expression datasets.…”
Section: Final Remarks
Citation type: mentioning
confidence: 99%
“…The analysis of gene expression data measured over time, known as "microarray time series" (MTS), has made it possible to understand a variety of biological processes (Mukhopadhyay & Chatterjee, 2007), since knowing which groups of genes are expressed in a similar way allows inferences about the functions and regulatory mechanisms of those genes (Costa et al, 2004).…”
Section: Introduction
Citation type: unclassified