2021
DOI: 10.1609/aaai.v26i1.8282
Weighted Clustering

Abstract: We investigate a natural generalization of the classical clustering problem, considering clustering tasks in which different instances may have different weights. We conduct the first extensive theoretical analysis on the influence of weighted data on standard clustering algorithms in both the partitional and hierarchical settings, characterizing the conditions under which algorithms react to weights. Extending a recent framework for clustering algorithm selection, we propose intuitive properties that would al…
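To make the abstract's setting concrete, the following is a minimal sketch (not from the paper itself) of a weighted variant of Lloyd's k-means, where each instance carries a weight and centroids are updated as weighted means. All names and the toy data are illustrative assumptions.

```python
import numpy as np

def weighted_kmeans(X, w, k, n_iter=50, seed=0):
    """Lloyd's algorithm on weighted data: each point x_i carries a
    weight w_i, and each centroid is the weighted mean of its cluster."""
    rng = np.random.default_rng(seed)
    # initialize centroids from k distinct data points
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # assign each point to its nearest centroid (Euclidean)
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # weighted centroid update
        for j in range(k):
            mask = labels == j
            if mask.any():
                centroids[j] = np.average(X[mask], axis=0, weights=w[mask])
    return labels, centroids

# Two tight groups plus an outlier: with uniform weights the outlier
# contributes like any other point; with a large weight it dominates
# the centroid of whichever cluster absorbs it.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 0.0], [5.1, 0.0], [10.0, 0.0]])
uniform = np.ones(len(X))
heavy = np.array([1.0, 1.0, 1.0, 1.0, 50.0])
_, c_uniform = weighted_kmeans(X, uniform, k=2)
_, c_heavy = weighted_kmeans(X, heavy, k=2)
```

This illustrates the paper's basic question: a partitional algorithm "reacts to weights" precisely when reweighting instances, with the data points fixed, can change the output partition.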

Cited by 6 publications (3 citation statements) · References 0 publications
“…However, different algorithms may return different results on the same data, with larger or smaller intersection areas, depending on the specific application. Therefore, one must choose the most appropriate clustering algorithm for the given data set [139,140]. The critical point of the clustering algorithms lies in comparing every pair of objects in a data set, i.e., to evaluate the distance between every pair of objects.…”
Section: Methods
Confidence: 99%
“…For a p-space set of N objects, where p denotes the number of variables that define an object, there are N(N − 1)/2 distances, and each distance is a p-dimensional one. Various methods express distances in data analysis [139,141], but the most natural and widely used is the well-known Euclidean distance. The Euclidean distance between two objects is treated as p-space vectors whose elements are variables in the data set:…”
Section: Methods
Confidence: 99%
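The excerpt above notes that a set of N objects in p dimensions yields N(N − 1)/2 pairwise distances, with Euclidean distance as the standard choice. A minimal sketch (illustrative, not from the cited papers) computing this condensed distance vector:

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two p-dimensional vectors:
    sqrt(sum_i (a_i - b_i)^2)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def condensed_distances(objects):
    """All N(N-1)/2 pairwise distances, one per unordered pair."""
    return [euclidean(a, b) for a, b in combinations(objects, 2)]

points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
d = condensed_distances(points)  # 3 objects -> 3*(3-1)/2 = 3 distances
```

This pairwise comparison step is the "critical point" the citing authors refer to: it is where the quadratic cost of most clustering algorithms comes from.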
“…Several well-known algorithms originated in this field, such as single linkage [4], average linkage [5], and Ward method [6]. Nowadays, there exists a large literature on ultrametric fitting, which can be roughly divided in four categories: agglomerative and divisive greedy heuristics [7][8][9][10][11][12][13], integer linear programming [14][15][16], continuous relaxations [17][18][19][20], and probabilistic formulations [21][22][23]. Our work belongs to the family of continuous relaxations.…”
Section: Introduction
Confidence: 99%
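The excerpt above places single linkage among the agglomerative greedy heuristics, whose merge heights induce an ultrametric on the data. A naive O(n³) sketch of that idea (illustrative only; the cited papers use far more efficient formulations):

```python
import math

def single_linkage(points):
    """Naive single-linkage agglomeration: repeatedly merge the two
    clusters whose closest pair of members is nearest, recording each
    merge height. The sequence of heights defines the dendrogram, and
    the height at which two points first share a cluster is an
    ultrametric distance between them."""
    clusters = [[p] for p in points]
    merge_heights = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single linkage: distance between clusters is the
                # minimum distance over cross-cluster pairs
                d = min(math.dist(a, b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merge_heights.append(d)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merge_heights

heights = single_linkage([(0.0,), (1.0,), (10.0,), (11.5,)])
```

Average linkage and Ward differ only in how the cluster-to-cluster distance is defined; the greedy merge loop is the same.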