2008
DOI: 10.1002/cem.1189

Use of cluster separation indices and the influence of outliers: application of two new separation indices, the modified silhouette index and the overlap coefficient to simulated data and mouse urine metabolomic profiles

Abstract: To quantify the separation between classes, four indices are compared, namely the Davies-Bouldin index, the silhouette width, and two new approaches described in this paper: the modified silhouette width index, based on the proportion of objects with a positive silhouette width, and the Overlap Coefficient. Four sets of simulated datasets are described, each in turn consisting of 15 sets of data with varying degrees of overlap and differing in the nature of outliers. Three experimental datasets consisting of the gas chromatogra…
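
The abstract describes the modified silhouette index only as the proportion of objects with a positive silhouette width. A minimal sketch of that idea follows, assuming Euclidean distances and the standard silhouette definition (b - a) / max(a, b); the function names are hypothetical and the paper's exact formulation may differ.

```python
import math

def silhouette_widths(points, labels):
    """Standard silhouette width of each object (assumes at least two classes)."""
    classes = sorted(set(labels))
    widths = []
    for i, (p, li) in enumerate(zip(points, labels)):
        # Mean distance from p to the members of each class, excluding p itself.
        mean_dist = {}
        for c in classes:
            d = [math.dist(p, q)
                 for j, (q, lj) in enumerate(zip(points, labels))
                 if lj == c and j != i]
            mean_dist[c] = sum(d) / len(d) if d else None
        a = mean_dist[li]
        if a is None:
            # Only member of its class: width set to 0 by convention.
            widths.append(0.0)
            continue
        # b is the mean distance to the nearest other class.
        b = min(v for c, v in mean_dist.items() if c != li)
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths

def proportion_positive(points, labels):
    """Assumed form of the modified index: fraction of objects with width > 0."""
    widths = silhouette_widths(points, labels)
    return sum(w > 0 for w in widths) / len(widths)

# Illustrative data (not from the paper): the third point lies among class 0
# but carries label 1, so it alone has a negative silhouette width.
points = [(0, 0), (0, 1), (0, 0.5), (10, 0), (10, 1)]
labels = [0, 0, 1, 1, 1]
print(proportion_positive(points, labels))
```

Because only the sign of each width is counted, one extreme object can change this proportion by at most 1/n, whereas it can shift a mean silhouette width much further.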

Cited by 23 publications (12 citation statements)
References 24 publications (27 reference statements)
“…Simply, a visual inspection of the clustering pattern or class separation in a scores plot is not typically sufficient to infer statistical relevance. Methods using cluster overlap metrics [99], statistical distances [98], and hierarchical clustering [100, 101] have been successfully used to quantify separations in scores plots. Also, class membership may be inferred from 95% confidence ellipses calculated from scores [101].…”
Section: Methods Selection (mentioning)
confidence: 99%
“…The Davies-Bouldin index leaves the power of the norms for the distance between the class centres ($\lVert m_1 - m_2 \rVert_p$) and the dispersion (here $S_w$) as parameters, and can therefore be seen as a generalization of the Fisher criterion. The original use of the Davies-Bouldin index was for the problem of finding an optimal subdivision of data points into classes when true class membership is unknown, which is also known as the clustering problem [14,15]. Davies and Bouldin define five properties that should hold for a cluster similarity measure.…”
Section: Relation To the Fisher Criterion (mentioning)
confidence: 99%
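
The statement above notes that the Davies-Bouldin index leaves two norm powers as parameters. A minimal sketch of that parameterisation follows, assuming the usual form: a Minkowski distance of order p between class centres, a dispersion taken as the q-th power mean of Euclidean distances to the centre, and the worst-case ratio averaged over classes. The names `davies_bouldin`, `p` and `q` are illustrative and do not come from the cited papers.

```python
import math

def centroid(points):
    """Component-wise mean of a list of points."""
    n = len(points)
    return [sum(c) / n for c in zip(*points)]

def minkowski(u, v, p):
    """Minkowski distance of order p between two points."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)

def dispersion(points, centre, q):
    """q-th power mean of Euclidean distances to the class centre."""
    return (sum(math.dist(x, centre) ** q for x in points) / len(points)) ** (1.0 / q)

def davies_bouldin(clusters, p=2, q=2):
    """Mean over classes of the largest (S_i + S_j) / M_ij ratio."""
    centres = [centroid(c) for c in clusters]
    spreads = [dispersion(c, m, q) for c, m in zip(clusters, centres)]
    k = len(clusters)
    worst = [
        max((spreads[i] + spreads[j]) / minkowski(centres[i], centres[j], p)
            for j in range(k) if j != i)
        for i in range(k)
    ]
    return sum(worst) / k

# Illustrative data: the same class shapes, with centres 10 and 3 units apart.
far = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
near = [[(0, 0), (0, 2)], [(3, 0), (3, 2)]]
print(davies_bouldin(far), davies_bouldin(near))
```

Lower values indicate better separation. With two classes the index reduces to a single ratio of summed dispersions to centre distance, which is the sense in which it resembles the Fisher criterion.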
“…The biological datasets used in the analysis have been described in more detail elsewhere (Dixon et al 2009) and a summary of sample numbers used in this paper is presented in Table 3. For the diet study (dataset 8), urine was sampled from 10 mice on a high fat diet and 9 controls at 10-12 weeks of age.…”
Section: Datasets and Sample Collection (mentioning)
confidence: 99%