Avoiding Bias in Text Clustering Using Constrained K-means and May-Not-Links

Ares, M. Eduardo; Parapar, Javier; Barreiro, Álvaro

doi:10.1007/978-3-642-04417-5_32

Cited by 6 publications

(13 citation statements)

References 13 publications


“…Firstly, we survey normalised cut, a very effective spectral clustering algorithm introduced by Shi and Malik in [5], and its constrained counterpart, constrained normalised cut, introduced by Ji et al in [4]. Afterwards, we outline soft constrained k-means, a constrained clustering algorithm based on k-means introduced by Ares et al in [3].…”

Section: Clustering Algorithms (mentioning)

confidence: 99%

“…In order to address these limitations, Ares et al introduced in [3] two kinds of non absolute constraints: May-Links and May-Not-Links, which indicate that two documents are, respectively, likely or not likely to be in the same cluster. The implementation of these constraints alters again the assignment process of the documents.…”

Section: Soft Constrained K-means (mentioning)

confidence: 99%
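
The statement above says that May-Links and May-Not-Links alter the assignment step of k-means. The following is a minimal sketch of how such a soft-constrained assignment pass could look. It is not the formulation from Ares et al.: the scoring rule (cosine similarity plus or minus a fixed bonus per constraint), the function name `assign_with_soft_constraints`, and the weight `w` are all assumptions made for illustration, and document vectors are assumed to be nonzero.

```python
import numpy as np

def assign_with_soft_constraints(X, centroids, may_links, may_not_links, w=0.5):
    """One assignment pass of a k-means variant with soft pairwise constraints.

    X: (n, d) array of nonzero document vectors; centroids: (k, d) array.
    may_links / may_not_links: iterables of (i, j) document index pairs.
    w: hypothetical weight of one constraint relative to the similarity score.
    """
    n = X.shape[0]
    # Cosine similarity between every document and every centroid.
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Cn = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    sim = Xn @ Cn.T

    # Index each document's constraint partners with a signed vote:
    # +1 for a May-Link, -1 for a May-Not-Link.
    partners = {i: [] for i in range(n)}
    for pairs, sign in ((may_links, 1.0), (may_not_links, -1.0)):
        for i, j in pairs:
            partners[i].append((j, sign))
            partners[j].append((i, sign))

    labels = np.full(n, -1)
    for i in range(n):
        score = sim[i].copy()
        for j, sign in partners[i]:
            if labels[j] >= 0:  # only already-assigned partners can vote
                score[labels[j]] += sign * w
        labels[i] = int(np.argmax(score))
    return labels
```

Because the constraints only shift the score, a document can still join the cluster of a May-Not-Link partner when its similarity to that centroid is high enough. This is what makes the constraints "non absolute" in this sketch; a full algorithm would alternate this pass with centroid updates until convergence.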

“…If we are trying to avoid the tendency (bias) of the clustering algorithm to fall in a certain grouping of the data that is being clustered the task is called Avoiding Bias. This problem has been tackled by several authors in the last years, which have proposed a wide range of approaches, ranging from distance learning [2] to using constraints [3]. However, it should be underlined that avoiding bias is still a clustering process, where the main focus is providing the user with a meaningful grouping of the data.…”

Section: Introduction (mentioning)

confidence: 99%

“…Concretely, we test two different approaches which use a strategy similar to the one in [3] (using negative constraints to steer the clustering process away from the known clustering), making use of spectral clustering techniques to try to attain that high quality. The first one is introducing negative constraints in the constrained normalised clustering approach proposed by Ji et al in [4].…”

Section: Introduction (mentioning)

confidence: 99%

“…The first one is introducing negative constraints in the constrained normalised clustering approach proposed by Ji et al in [4]. The second one is introducing the soft constrained k-means algorithm proposed by Ares et al in [3], which has been shown to have good results, in the second phase of a normalised cut clustering algorithm [5]. The experiments carried out with these approaches showed that, while the first approach does not yield good results, the combined one (normalised cut plus soft constrained k Means) outperforms soft constrained k-means in terms of quality of the results while keeping a good avoidance of the known clustering.…”

Section: Introduction (mentioning)

confidence: 99%


Improving Alternative Text Clustering Quality in the Avoiding Bias Task with Spectral and Flat Partition Algorithms

Ares; Parapar; Barreiro

2010, Lecture Notes in Computer Science

Self Cite


Abstract. The problems of finding alternative clusterings and avoiding bias have gained popularity over the last years. In this paper we put the focus on the quality of these alternative clusterings, proposing two approaches based in the use of negative constraints in conjunction with spectral clustering techniques. The first approach tries to introduce these constraints in the core of the constrained normalised cut clustering, while the second one combines spectral clustering and soft constrained k-means. The experiments performed in textual collections showed that the first method does not yield good results, whereas the second one attains large increments on the quality of the results of the clustering while keeping low similarity with the avoided grouping.
