2001
DOI: 10.1007/3-540-48219-9_31

Finding Consistent Clusters in Data Partitions

Abstract: Given an arbitrary data set, to which no particular parametrical, statistical or geometrical structure can be assumed, different clustering algorithms will in general produce different data partitions. In fact, several partitions can also be obtained by using a single clustering algorithm due to dependencies on initialization or the selection of the value of some design parameter. This paper addresses the problem of finding consistent clusters in data partitions, proposing the analysis of the most common assoc…

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
2
2
1

Citation Types

0
142
0
7

Year Published

2005
2005
2021
2021

Publication Types

Select...
4
3
2

Relationship

0
9

Authors

Journals

Cited by 224 publications (151 citation statements)
References 17 publications (19 reference statements)
“…When the single linkage method has been used for consensus clustering, it has been referred to as the majority rule or the quota rule [21,41]. The DIRECT procedure in CLUTO provided the next consensus clustering method.…”
Section: Consensus Clustering Methods (mentioning)
confidence: 99%
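A minimal sketch of how such a majority or quota rule can be realized with single linkage over a co-association matrix; this assumes a precomputed co-association matrix with values in [0, 1], and the function name, default threshold, and use of SciPy are illustrative choices rather than anything specified in the cited papers.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def quota_rule_consensus(coassoc, threshold=0.5):
    """Cut a single-linkage dendrogram built from a co-association matrix.

    coassoc[i, j] is the fraction of base partitions in which points i and j
    were assigned to the same cluster (values in [0, 1]). With threshold=0.5
    this behaves like a majority rule: points end up in the same consensus
    cluster if they can be chained through pairs whose co-association is at
    least the threshold.
    """
    # Convert the similarity (co-association) matrix to a distance matrix.
    dist = 1.0 - np.asarray(coassoc, dtype=float)
    np.fill_diagonal(dist, 0.0)
    condensed = squareform(dist, checks=False)
    Z = linkage(condensed, method="single")
    # Cutting at distance 1 - threshold keeps links with co-association >= threshold.
    return fcluster(Z, t=1.0 - threshold, criterion="distance")
```

With threshold = 0.5 this acts as a majority rule; other quota values give the more general quota rule mentioned in the excerpt.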
“…Fred and Jain [9,11,12,10] suggest running the k-means clustering algorithm several times with random initial conditions. In each of the clustering trials, the number of clusters, k, is either fixed or chosen randomly in the range k ∈ [k_min, k_max].…”
Section: Consensus Clustering (mentioning)
confidence: 99%
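A rough sketch of this evidence-accumulation scheme, assuming scikit-learn's KMeans as the base clusterer; the number of runs, the k range, and the function name are placeholder choices, not the settings used by Fred and Jain.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_coassociation(X, n_runs=30, k_min=2, k_max=10, seed=0):
    """Accumulate pairwise co-association evidence over repeated k-means runs.

    Each run draws a random k from [k_min, k_max] and a random initialization;
    the entry coassoc[i, j] is the fraction of runs in which points i and j
    landed in the same cluster.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    coassoc = np.zeros((n, n))
    for _ in range(n_runs):
        k = rng.integers(k_min, k_max + 1)
        labels = KMeans(n_clusters=int(k), n_init=1,
                        random_state=int(rng.integers(1 << 31))).fit_predict(X)
        # Points sharing a label in this run contribute one vote of evidence.
        coassoc += (labels[:, None] == labels[None, :]).astype(float)
    return coassoc / n_runs
```

The resulting matrix can then be fed to a rule such as quota_rule_consensus above to extract the final partition.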
“…Each value of k is used to accumulate evidence about the clustering structure using two different approaches:
- In the first approach, what we refer to as a consensus matrix is computed. This entails counting, for each k, whether or not pairs of data points belong to the same basin of attraction (mode), and then averaging over all k. Based on the consensus matrix, a hierarchical clustering approach similar to that used in [9] and [11] is applied to obtain the final clustering result.
- The second approach we investigate computes, for each k, an information-theoretic divergence measure between pairs of modes, resulting in a similarity matrix between modes, which is then averaged over all k. A spectral clustering procedure is then executed on this matrix, similar to [1].…”
Section: Introduction (mentioning)
confidence: 99%
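The second approach in this excerpt, averaging per-k similarity matrices and then clustering them spectrally, might look roughly like the following; the divergence-based similarities are assumed to have been computed elsewhere, and the input format, function name, and parameters are illustrative rather than the cited authors' API.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def spectral_consensus(similarity_per_k, n_clusters):
    """Average per-k similarity matrices and cluster the result spectrally.

    similarity_per_k is a list of (n, n) non-negative similarity matrices,
    one per value of k (e.g. derived from a divergence measure between modes
    and mapped back to the data points).
    """
    avg = np.mean(np.stack(similarity_per_k, axis=0), axis=0)
    model = SpectralClustering(n_clusters=n_clusters, affinity="precomputed")
    # fit_predict treats avg as a precomputed affinity matrix.
    return model.fit_predict(avg)
```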
“…A popular technique for merging is called 'majority voting' [11,12], which is a pair-counting method extended over multiple clusterings. It uses a co-association matrix of data points, where pairs of points are given a score whenever they appear in the same cluster across the available clusterings.…”
Section: Related Work (mentioning)
confidence: 99%
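A hedged sketch of the pair-counting majority-voting idea: each clustering casts one vote per pair of points, and pairs that co-occur in more than half of the clusterings are linked into the same consensus cluster via connected components. The 0.5 threshold and the connected-components step are one plausible reading of "majority voting", not necessarily the exact procedure in [11,12].

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def majority_voting(label_matrix):
    """Merge points that share a cluster in more than half of the clusterings.

    label_matrix has shape (n_clusterings, n_points); entry [r, i] is the
    cluster id of point i in clustering r.
    """
    label_matrix = np.asarray(label_matrix)
    n_runs, n = label_matrix.shape
    votes = np.zeros((n, n))
    for labels in label_matrix:
        # Pair counting: each clustering contributes one vote per co-clustered pair.
        votes += (labels[:, None] == labels[None, :]).astype(float)
    majority = votes / n_runs > 0.5  # pairs that co-occur in a majority of runs
    # Points connected through majority pairs form one consensus cluster.
    _, consensus = connected_components(csr_matrix(majority), directed=False)
    return consensus
```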