2007
DOI: 10.1016/j.patcog.2006.06.026

Model-based evaluation of clustering validation measures

Cited by 193 publications (153 citation statements)
References 15 publications
“…We agree with Dougherty and Brun [19,15] that "validation" of clustering results is a heuristic process, even though there are some interesting efforts to objectively incorporate biological knowledge in this process using Gene Ontology, especially when one is clustering gene expression profiles [17,23]. However, to illustrate the usefulness of our software, we collected several examples in which the performance of Simcluster can be considered as qualitatively superior to some traditional approaches imported from the microarray analysis field.…”
Section: Results | Citation type: supporting
confidence: 65%
“…62,63 Perhaps the oxide electronics challenges are rather better described as "small" and "deep" functionality-targeted data problems, where the necessary prior knowledge required to make progress still remains to be uncovered from hypothesis- and exploration-driven scientific discovery methodologies and personal communications between theorists and experimentalists. The inability of informatics-based data analyses alone to create robust, i.e., optimal model design and error estimation, mechanistic knowledge from small data sets, typical of what we can expect in oxides, has been borne out in biology and genetics; [64][65][66] similar pitfalls should be avoidable in materials science. The field should establish to what extent statistical inference and learning methods can benefit materials discovery by including authentic structure-property relationships guided by physical models.…”
Section: -3 | Citation type: mentioning
confidence: 99%
“…Clustering validity measures fall broadly into three classes [13]: a) internal validation (based on properties of the resulting clusters), b) relative validation (running the algorithm with different parameters), c) external validation (comparison with a given partition of the data).…”
Section: Evaluation Of Clustering Results | Citation type: mentioning
confidence: 99%
“…Following the recommendation proposed in [13], we evaluate the clustering results based on silhouette values that seem to be a good internal validation measure and also provide good graphical representation of clustering quality. The silhouettes validation technique [14] calculates the silhouette width for each sample, the average silhouette width for each cluster and the overall average silhouette width for a total data set.…”
Section: Evaluation Of Clustering Results | Citation type: mentioning
confidence: 99%
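
The silhouette computation described in the statement above (a width per sample, an average per cluster, and an overall average) can be illustrated with a minimal sketch. It is not code from the cited or citing papers. It assumes Euclidean distance and the standard silhouette definition s(i) = (b(i) - a(i)) / max(a(i), b(i)), with s(i) = 0 for a sample alone in its cluster; the function name `silhouette_widths` and its return shape are hypothetical.

```python
import math
from collections import defaultdict


def silhouette_widths(points, labels):
    """Return (per_sample, per_cluster, overall) silhouette widths.

    Assumptions of this sketch: Euclidean distance, at least two clusters,
    and a width of 0 for any sample that is alone in its cluster.
    """
    # Group sample indices by cluster label.
    clusters = defaultdict(list)
    for idx, lab in enumerate(labels):
        clusters[lab].append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette widths need at least two clusters")

    def mean_dist(i, members):
        # Mean distance from sample i to the members of a cluster, excluding i itself.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    per_sample = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            per_sample.append(0.0)  # singleton cluster
            continue
        a = mean_dist(i, own)  # cohesion: mean distance within own cluster
        b = min(mean_dist(i, m) for other, m in clusters.items() if other != lab)  # nearest other cluster
        denom = max(a, b)
        per_sample.append((b - a) / denom if denom > 0 else 0.0)

    # Average silhouette width per cluster, then over the whole data set.
    per_cluster = {lab: sum(per_sample[i] for i in m) / len(m) for lab, m in clusters.items()}
    overall = sum(per_sample) / len(per_sample)
    return per_sample, per_cluster, overall


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    labs = [0, 0, 1, 1]
    per_sample, per_cluster, overall = silhouette_widths(pts, labs)
    print(per_sample, per_cluster, overall)
```

Widths near 1 indicate samples that sit well inside their cluster, widths near 0 indicate samples on a boundary, and negative widths suggest a sample may be closer to a neighbouring cluster.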