We construct near-optimal coresets for kernel density estimates of points in R^d when the kernel is positive definite. Specifically, we give a polynomial-time construction of a coreset of size O(√d/ε · log(1/ε)), and we show a near-matching lower bound of Ω(min{√d/ε, 1/ε²}). When d ≥ 1/ε², it is known that the coreset can have size O(1/ε²). Our upper bound is a polynomial-in-(1/ε) improvement when d ∈ [3, 1/ε²), and our lower bound is the first lower bound for this problem known to depend on d. Moreover, the upper bound's restriction to positive definite kernels is significant in that it covers a wide variety of kernels, specifically those most important for machine learning, including kernels for information distances and the sinc kernel, which can take negative values.
We study the construction of coresets for kernel density estimates; that is, we show how to approximate the kernel density estimate described by a large point set with another kernel density estimate over a much smaller point set. For characteristic kernels (including the Gaussian and Laplace kernels), our approximation preserves the L∞ error between kernel density estimates to within ε, with a coreset of size 4/ε², independent of all other aspects of the data, including the dimension, the diameter of the point set, and the bandwidth of the kernel, on which other approximations commonly depend. When the dimension is unrestricted, we show this bound is tight for these kernels as well as for a much broader class. This work provides a careful analysis of the iterative Frank-Wolfe algorithm adapted to this context, an algorithm called kernel herding. This analysis unites a broad line of work spanning statistics, machine learning, and geometry. When the dimension d is constant, we demonstrate much tighter bounds on the coreset size specifically for Gaussian kernels, showing that it is bounded by the size of a coreset for axis-aligned rectangles. Currently the best known constructive bound is O((1/ε) log^d(1/ε)), and non-constructively this can be improved by a factor of log(1/ε). This improves the best constant-dimension bounds polynomially for d ≥ 3.
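The kernel herding procedure analyzed above can be illustrated with a short sketch: greedily pick coreset points so that the coreset's kernel mean tracks the full data's kernel mean, then measure the L∞ gap between the two kernel density estimates. This is a minimal toy version assuming a Gaussian kernel; the function names, candidate-restricted greedy step, and parameter choices here are illustrative, not the paper's construction.

```python
# A minimal sketch of kernel herding (Frank-Wolfe on the kernel mean).
# Assumptions: Gaussian kernel, candidates restricted to the input points,
# unit weights on coreset points. All names here are illustrative.
import numpy as np

def gaussian_kernel(a, b, bandwidth=1.0):
    """k(x, y) = exp(-||x - y||^2 / (2 h^2)) for all pairs of rows."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bandwidth ** 2))

def kernel_herding(points, coreset_size, bandwidth=1.0):
    """Greedily choose points whose kernel mean tracks the full kernel mean."""
    K = gaussian_kernel(points, points, bandwidth)
    mean_embedding = K.mean(axis=1)      # (1/n) sum_j k(x_i, x_j)
    coreset_sum = np.zeros(len(points))  # sum over chosen s of k(x_i, s)
    chosen = []
    for t in range(coreset_size):
        # Herding step: maximize kernel mean minus current coreset average.
        scores = mean_embedding - coreset_sum / (t + 1)
        idx = int(np.argmax(scores))
        chosen.append(idx)
        coreset_sum += K[:, idx]
    return points[chosen]

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
S = kernel_herding(X, coreset_size=50)

# L_inf gap between the two kernel density estimates, evaluated at X.
kde_full = gaussian_kernel(X, X).mean(axis=1)
kde_core = gaussian_kernel(X, S).mean(axis=1)
print(f"L_inf gap: {np.abs(kde_full - kde_core).max():.4f}")
```

Evaluating the gap only at the input points is a simplification; the guarantee discussed above concerns the L∞ error over all query points.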
The traditional requirement for a randomized streaming algorithm is one-shot: the algorithm should be correct (within the stated ε-error bound) at the end of the stream. In this paper, we study the tracking problem, where the output must be correct at all times. The standard approach to the tracking problem is to run O(log m) independent instances of the one-shot algorithm and apply a union bound over all m time instances. In this paper, we study whether this standard approach can be improved, for the classical frequency moment problem. We show that for the F_p problem, for any 1 < p ≤ 2, only O(log log m + log n) copies are needed to achieve the tracking guarantee in the cash register model, where n is the universe size. Meanwhile, we present a lower bound of Ω(log m log log m) bits for all linear sketches achieving this guarantee, which shows that our upper bound is tight when n = (log m)^{O(1)}. We also present an Ω(log² m) lower bound in the turnstile model, showing that the standard union-bound approach is essentially optimal there.
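The "standard approach" that this abstract takes as its baseline can be sketched concretely for F_2: run several independent one-shot AMS-style estimators and take a median at every time step, so a union bound over all m updates yields the tracking guarantee. This is a toy illustration with fully independent random signs (the theory only needs 4-wise independence); the function name and parameters are illustrative, not from the paper.

```python
# Toy sketch of tracking F_2 via the standard approach: independent
# one-shot AMS estimators, median-combined at every time step.
# Assumptions: cash-register model (+1 updates), fully independent signs
# in place of 4-wise independent hashing. Names are illustrative.
import numpy as np

def ams_tracking(stream, n, copies=31, buckets=64, seed=0):
    rng = np.random.default_rng(seed)
    # One +/-1 sign per (copy, bucket, universe item).
    signs = rng.choice([-1.0, 1.0], size=(copies, buckets, n))
    z = np.zeros((copies, buckets))   # linear sketches: z = sum_i sign(i) f_i
    freq = np.zeros(n)                # exact frequencies, for comparison only
    history = []
    for item in stream:
        freq[item] += 1
        z += signs[:, :, item]
        # Mean of z^2 within each copy, median across copies.
        est = np.median((z ** 2).mean(axis=1))
        history.append((est, (freq ** 2).sum()))
    return history

rng = np.random.default_rng(1)
stream = rng.integers(0, 20, size=2000)
history = ams_tracking(stream, n=20)
ok = all(abs(est - truth) <= 0.5 * truth for est, truth in history)
print("within 50% of true F_2 at all times:", ok)
```

The point of the abstract is that for F_p tracking in the cash register model, far fewer than O(log m) copies suffice; this sketch only shows the baseline that result improves on.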