Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing 2021
DOI: 10.1145/3406325.3451022

A new coreset framework for clustering

Abstract: In all state-of-the-art sketching and coreset techniques for clustering, as well as in the best known fixed-parameter tractable approximation algorithms, randomness plays a key role. For the classic k-median and k-means problems, there is no known deterministic dimensionality reduction procedure or coreset construction that avoids an exponential dependency on the input dimension d, the precision parameter ε⁻¹, or k. Furthermore, there is no coreset construction that succeeds with probability 1 − 1/n and whose …

Cited by 30 publications (52 citation statements)
References 54 publications
“…Our result improves over the Ω(kε⁻¹ log n) lower bound of Baker, Braverman, Huang, Jiang, Krauthgamer, and Wu [6]. For the k-median and k-means objective matches the upper bounds proposed in Feldman and Langberg [41] and Cohen-Addad, Saulpic, and Schwiegelshohn [35] up to polylog(1/ε) factors.…”
Section: Best Lower Bound
Citation type: supporting
confidence: 69%
“…For any ε, k, D such that D ≥ 5 log(1/ε) and log k = O(D), there exists a graph with doubling dimension D on which any (ε, k, z)-coreset using offset ∆ must have size Ω(kD/ε²). This matches up to polylog(1/ε) factors the upper bound from [35] for k-median and k-means.…”
Section: Best Lower Bound
Citation type: supporting
confidence: 51%
“…In Euclidean space, recent work showed that coresets for k-MEANS and k-MEDIAN clustering can have size that is independent of the Euclidean dimension [FSS20, SW18, HV20]. Beyond Euclidean space, coresets of size independent of the data-set size were constructed also for many important metric spaces [HJLW18, BJKW21, CASS21]. A more comprehensive overview can be found in recent surveys [Phi17, Fel20].…”
Section: Additional Related Work
Citation type: mentioning
confidence: 99%