Scalable sparse approximation of a sample mean

Cortés, Efrén Cruz; Scott, Clayton

doi:10.1109/icassp.2014.6854602

Cited by 7 publications

(6 citation statements)

References 19 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…The former has been studied extensively in the literature. For example, Cruz Cortés and Scott (2014) considers the problem of approximating the kernel mean as a sparse linear combination of the sample.…”

Section: Approximating the Kernel Meanmentioning

confidence: 99%

See 1 more Smart Citation

Kernel Mean Embedding of Distributions: A Review and Beyond

et al. 2017

View full text Add to dashboard Cite

A Hilbert space embedding of a distribution-in short, a kernel mean embedding-has recently emerged as a powerful tool for machine learning and inference. The basic idea behind this framework is to map distributions into a reproducing kernel Hilbert space (RKHS) in which the whole arsenal of kernel methods can be extended to probability measures. It can be viewed as a generalization of the original "feature map" common to support vector machines (SVMs) and other kernel methods. While initially closely associated with the latter, it has meanwhile found application in fields ranging from kernel machines and probabilistic modeling to statistical inference, causal discovery, and deep learning.The goal of this survey is to give a comprehensive review of existing work and recent advances in this research area, and to discuss some of the most challenging issues and open problems that could potentially lead to new research directions. The survey begins with a brief introduction to the RKHS and positive definite kernels which forms the backbone of this survey, followed by a thorough discussion of the Hilbert space embedding of marginal distributions, theoretical guarantees, and a review of its applications. The embedding of distributions enables us to apply RKHS methods to probability measures which prompts a wide range of applications such as kernel two-sample testing, independent testing, group anomaly detection, and learning on distributional data. Next, we discuss the Hilbert space embedding for conditional distributions, give theoretical insights, and review some applications. The conditional mean embedding enables us to perform sum, product, and Bayes' rules-which are ubiquitous in graphical model, probabilistic inference, and reinforcement learning-in a non-parametric way using the new representation of distributions in RKHS. We then discuss relationships between this framework and other related areas. Lastly, we give some suggestions on future research directions. The targeted audience includes graduate students and researchers in machine learning and statistics who are interested in the theory and applications of kernel mean embeddings.G over some input space Y, we havewhere U Y |x denotes the embedding of the conditional distribution P(Y |X = x). That is, we can compute a conditional expected value of any function g ∈ G w.r.t. P(Y |X = x) by taking an inner product in G between the function g and the embedding of P(Y |X = x) (see Section 4A Synopsis. As a result of the aforementioned advantages, the kernel mean embedding has made widespread contributions in various directions. Firstly, most tasks in machine learning and statistics involve estimation of the data-generating process whose success depends critically on the accuracy and the reliability of this estimation. It is known that estimating the kernel mean embedding is easier than estimating the distribution itself, which helps improve many statistical inference methods. These include, for example, two-sample testing (

show abstract

Section: Approximating the Kernel Meanmentioning

confidence: 99%

“…The advances along this direction will benefit the development of algorithms using kernel mean embedding. In the context of MMD, many recent studies have addressed this issue (Zaremba et al 2013, Cruz Cortés and Scott 2014, Ji Zhao 2015, Chwialkowski et al 2015.…”

Section: Scalabilitymentioning

confidence: 99%

Kernel Mean Embedding of Distributions: A Review and Beyond

et al. 2017

View full text Add to dashboard Cite

show abstract

“…. , x n , a kernel φ, and a target sparsity k, we seek a sparse kernel mean (2) that accurately approximates the kernel mean (1). This problem is motivated by applications where n is so large that evaluation or manipulation of the full kernel mean is computationally prohibitive.…”

Section: Introductionmentioning

confidence: 99%

“…Finally, Section 7 applies our methodology in three different machine learning problems that rely on large-scale KDEs and KMEs, and demonstrates the efficacy of our approach. A preliminary version of this work appeared in [1]. A Matlab implementation of our algorithm is available at [2].…”

Section: Introductionmentioning

confidence: 99%

Sparse Approximation of a Kernel Mean

Cortés

Scott

2017

IEEE Trans. Signal Process.

Self Cite

View full text Add to dashboard Cite

Kernel means are frequently used to represent probability distributions in machine learning problems. In particular, the well known kernel density estimator and the kernel mean embedding both have the form of a kernel mean. Unfortunately, kernel means are faced with scalability issues. A single point evaluation of the kernel density estimator, for example, requires a computation time linear in the training sample size. To address this challenge, we present a method to efficiently construct a sparse approximation of a kernel mean. We do so by first establishing an incoherence-based bound on the approximation error, and then noticing that, for the case of radial kernels, the bound can be minimized by solving the k-center problem. The outcome is a linear time construction of a sparse kernel mean, which also lends itself naturally to an automatic sparsity selection scheme. We show the computational gains of our method by looking at three problems involving kernel means: Euclidean embedding of distributions, class proportion estimation, and clustering using the mean-shift algorithm.

show abstract

“…The first category tries to find a smaller subset of samples which approximate well to the original samples. For instance, a sparse linear combination of samples can approximate the kernel mean [Cortes and Scott, 2014]. Sparsity-inducing norm can also be imposed on coefficients of kernel mean [Muandet et al, 2014].…”

Section: Approximating the Kernel Mean Embeddingmentioning

confidence: 99%

Sensor-based activity recognition via learning from distributions

Qian

Pan

Chen

View full text Add to dashboard Cite

The contributions of the co-authors are as follows:• Prof. Pan and Prof. Miao provided the initial research direction.• I wrote the manuscript draft. The draft was revised by Prof. Pan.• I co-designed the experimental study with Prof. Pan, and performed all the laboratory work at the School of Computer Science and Engineering and LILY Lab. I also analyzed the data and experimental results.• I developed and released the code.

show abstract

Scalable sparse approximation of a sample mean

Cited by 7 publications

References 19 publications

Kernel Mean Embedding of Distributions: A Review and Beyond

Kernel Mean Embedding of Distributions: A Review and Beyond

Sparse Approximation of a Kernel Mean

Sensor-based activity recognition via learning from distributions

Contact Info

Product

Resources

About