We construct near-optimal coresets for kernel density estimates of points in R^d when the kernel is positive definite. Specifically, we give a polynomial-time construction of a coreset of size O(√d/ε · log(1/ε)), and we show a near-matching lower bound of Ω(min{√d/ε, 1/ε²}). When d ≥ 1/ε², it is known that the coreset can have size O(1/ε²). Our upper bound is a polynomial-in-(1/ε) improvement when d ∈ [3, 1/ε²), and our lower bound is the first lower bound for this problem known to depend on d. Moreover, the upper bound's restriction to positive definite kernels is significant in that it covers a wide variety of kernels, specifically those most important for machine learning, including kernels for information distances and the sinc kernel, which can take negative values.
We study the construction of coresets for kernel density estimates; that is, we show how to approximate the kernel density estimate described by a large point set with another kernel density estimate over a much smaller point set. For characteristic kernels (including the Gaussian and Laplace kernels), our approximation preserves the L∞ error between kernel density estimates to within ε, with a coreset of size 4/ε², independent of all other aspects of the data, including the dimension, the diameter of the point set, and the bandwidth of the kernel, on which other approximations commonly depend. When the dimension is unrestricted, we show this bound is tight for these kernels as well as for a much broader class. This work provides a careful analysis of the iterative Frank-Wolfe algorithm adapted to this context, an algorithm called kernel herding. This analysis unites a broad line of work spanning statistics, machine learning, and geometry. When the dimension d is constant, we demonstrate much tighter bounds on the coreset size specifically for Gaussian kernels, showing that it is bounded by the size of a coreset for axis-aligned rectangles. Currently the best known constructive bound is O((1/ε) log^d(1/ε)), and non-constructively this can be improved by a factor of log(1/ε). This improves the best constant-dimension bounds polynomially for d ≥ 3.
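The kernel herding procedure analyzed above can be illustrated with a short sketch: greedily pick coreset points so that the coreset's kernel mean tracks the full data's kernel mean, then measure the L∞ gap between the two kernel density estimates. This is a minimal toy version assuming a Gaussian kernel; the function names, candidate-restricted greedy step, and parameter choices here are illustrative, not the paper's construction.

```python
# A minimal sketch of kernel herding (Frank-Wolfe on the kernel mean).
# Assumptions: Gaussian kernel, candidates restricted to the input points,
# unit weights on coreset points. All names here are illustrative.
import numpy as np

def gaussian_kernel(a, b, bandwidth=1.0):
    """k(x, y) = exp(-||x - y||^2 / (2 h^2)) for all pairs of rows."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bandwidth ** 2))

def kernel_herding(points, coreset_size, bandwidth=1.0):
    """Greedily choose points whose kernel mean tracks the full kernel mean."""
    K = gaussian_kernel(points, points, bandwidth)
    mean_embedding = K.mean(axis=1)      # (1/n) sum_j k(x_i, x_j)
    coreset_sum = np.zeros(len(points))  # sum over chosen s of k(x_i, s)
    chosen = []
    for t in range(coreset_size):
        # Herding step: maximize kernel mean minus current coreset average.
        scores = mean_embedding - coreset_sum / (t + 1)
        idx = int(np.argmax(scores))
        chosen.append(idx)
        coreset_sum += K[:, idx]
    return points[chosen]

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
S = kernel_herding(X, coreset_size=50)

# L_inf gap between the two kernel density estimates, evaluated at X.
kde_full = gaussian_kernel(X, X).mean(axis=1)
kde_core = gaussian_kernel(X, S).mean(axis=1)
print(f"L_inf gap: {np.abs(kde_full - kde_core).max():.4f}")
```

Evaluating the gap only at the input points is a simplification; the guarantee discussed above concerns the L∞ error over all query points.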
The traditional requirement for a randomized streaming algorithm is one-shot: the algorithm should be correct (within the stated ε-error bound) at the end of the stream. In this paper, we study the tracking problem, where the output must be correct at all times. The standard approach to the tracking problem is to run O(log m) independent instances of the one-shot algorithm and apply a union bound over all m time instances. In this paper, we study whether this standard approach can be improved, for the classical frequency moment problem. We show that for the F_p problem, for any 1 < p ≤ 2, only O(log log m + log n) copies are needed to achieve the tracking guarantee in the cash register model, where n is the universe size. Meanwhile, we present a lower bound of Ω(log m log log m) bits for all linear sketches achieving this guarantee, which shows that our upper bound is tight when n = (log m)^{O(1)}. We also present an Ω(log² m) lower bound in the turnstile model, showing that the standard union-bound approach is essentially optimal there.
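The "standard approach" that this abstract takes as its baseline can be sketched concretely for F_2: run several independent one-shot AMS-style estimators and take a median at every time step, so a union bound over all m updates yields the tracking guarantee. This is a toy illustration with fully independent random signs (the theory only needs 4-wise independence); the function name and parameters are illustrative, not from the paper.

```python
# Toy sketch of tracking F_2 via the standard approach: independent
# one-shot AMS estimators, median-combined at every time step.
# Assumptions: cash-register model (+1 updates), fully independent signs
# in place of 4-wise independent hashing. Names are illustrative.
import numpy as np

def ams_tracking(stream, n, copies=31, buckets=64, seed=0):
    rng = np.random.default_rng(seed)
    # One +/-1 sign per (copy, bucket, universe item).
    signs = rng.choice([-1.0, 1.0], size=(copies, buckets, n))
    z = np.zeros((copies, buckets))   # linear sketches: z = sum_i sign(i) f_i
    freq = np.zeros(n)                # exact frequencies, for comparison only
    history = []
    for item in stream:
        freq[item] += 1
        z += signs[:, :, item]
        # Mean of z^2 within each copy, median across copies.
        est = np.median((z ** 2).mean(axis=1))
        history.append((est, (freq ** 2).sum()))
    return history

rng = np.random.default_rng(1)
stream = rng.integers(0, 20, size=2000)
history = ams_tracking(stream, n=20)
ok = all(abs(est - truth) <= 0.5 * truth for est, truth in history)
print("within 50% of true F_2 at all times:", ok)
```

The point of the abstract is that for F_p tracking in the cash register model, far fewer than O(log m) copies suffice; this sketch only shows the baseline that result improves on.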