Faster Kernel Ridge Regression Using Sketching and Preconditioning
2017 · DOI: 10.1137/16m1105396

Abstract: Kernel Ridge Regression is a simple yet powerful technique for non-parametric regression whose computation amounts to solving a linear system. This system is usually dense and highly ill-conditioned. In addition, the dimensions of the matrix are the same as the number of data points, so direct methods are unrealistic for large-scale datasets. In this paper, we propose a preconditioning technique for accelerating the solution of the aforementioned linear system. The preconditioner is based on random feature map…
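The abstract's core recipe, solving the regularized kernel system iteratively with a preconditioner built from a random feature map, can be sketched in a few lines. The following is a minimal illustration under assumptions of my own (Gaussian kernel, random Fourier features, conjugate gradients via SciPy, Woodbury identity to apply the preconditioner); it conveys the general idea, not the authors' exact algorithm or parameter choices, and all function names are illustrative.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def gaussian_kernel(X, Y, sigma=1.0):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def random_fourier_features(X, s, sigma=1.0, seed=0):
    """Feature matrix Z (n x s) with Z @ Z.T approximating the Gaussian kernel."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / sigma, size=(X.shape[1], s))
    b = rng.uniform(0.0, 2 * np.pi, size=s)
    return np.sqrt(2.0 / s) * np.cos(X @ W + b)

def krr_preconditioned_cg(X, y, lam=1e-2, sigma=1.0, s=200):
    """Solve (K + lam*I) alpha = y by CG with a random-feature preconditioner."""
    n = X.shape[0]
    K = gaussian_kernel(X, X, sigma)            # dense, ill-conditioned n x n system
    A = LinearOperator((n, n), matvec=lambda v: K @ v + lam * v, dtype=K.dtype)

    Z = random_fourier_features(X, s, sigma)    # n x s sketch with s << n
    # Preconditioner P = Z Z^T + lam*I, applied cheaply via the Woodbury identity:
    # P^{-1} v = (v - Z (Z^T Z + lam*I)^{-1} Z^T v) / lam
    G = np.linalg.inv(Z.T @ Z + lam * np.eye(s))
    M = LinearOperator((n, n), dtype=K.dtype,
                       matvec=lambda v: (v - Z @ (G @ (Z.T @ v))) / lam)

    alpha, _ = cg(A, y, M=M, maxiter=500)
    return alpha
```

Because the preconditioner captures most of the spectrum of K, the preconditioned system is far better conditioned and CG converges in few iterations, each costing only matrix-vector products.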

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
4
1

Citation Types

1
86
0

Year Published

2017
2017
2023
2023

Publication Types

Select...
3
2
1

Relationship

0
6

Authors

Journals

Cited by 79 publications (87 citation statements) · References 17 publications
“…As long as M is sufficiently smaller than n, this leads to more scalable solutions, e.g., for regression we get back to O(nM^2) training and O(Md) prediction time, with O(nM) memory requirements. We can also apply random features to unsupervised learning problems, such as kernel clustering [Chitta et al, 2012].…”
Section: Random Features (RF) · Citation type: mentioning
Confidence: 99%
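The complexity claims in this statement are easy to trace in code. Below is a minimal sketch (my own illustration, not code from the cited paper) of ridge regression on M random Fourier features: forming the normal equations costs O(nM^2), solving them O(M^3), a prediction costs O(Md) to featurize plus O(M) for the inner product, and the feature matrix occupies O(nM) memory. Function names and parameters are assumptions.

```python
import numpy as np

def fit_rf_ridge(X, y, M=300, sigma=1.0, lam=1e-2, seed=0):
    """Ridge regression on M random Fourier features (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=1.0 / sigma, size=(d, M))   # frequencies for a Gaussian kernel
    b = rng.uniform(0.0, 2 * np.pi, size=M)
    Z = np.sqrt(2.0 / M) * np.cos(X @ W + b)         # n x M features: O(nM) memory
    A = Z.T @ Z + lam * np.eye(M)                    # O(nM^2) to form
    w = np.linalg.solve(A, Z.T @ y)                  # O(M^3) to solve
    return W, b, w

def predict_rf_ridge(x_new, W, b, w):
    M = W.shape[1]
    z = np.sqrt(2.0 / M) * np.cos(x_new @ W + b)     # O(Md) per prediction
    return z @ w
```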
“…An important question is how to construct different weights on the integrand functions so as to obtain a better kernel approximation. Given a fixed QMC sequence, Avron et al [2016b] solve for the weights by optimizing a derived error bound based on the observed data. There are two potential weaknesses of [Avron et al, 2016b].…”
Section: Bayesian Quadrature for Random Features · Citation type: mentioning
Confidence: 99%
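To make the role of the weights concrete, the sketch below (my own illustration, with assumed names) writes the kernel approximation as a weighted sum over sampled frequencies, k(x, y) ≈ Σ_m v[m] cos(w_m·(x − y)). Plain Monte Carlo random features correspond to uniform weights v[m] = 1/M; the works discussed above instead choose non-uniform weights (e.g., by optimizing an error bound over a fixed QMC sequence), which is not implemented here.

```python
import numpy as np

def weighted_rf_kernel(X, Y, W, v):
    """K_hat[i, j] = sum_m v[m] * cos(W[:, m] . (X[i] - Y[j]))."""
    diff = X[:, None, :] - Y[None, :, :]             # pairwise differences
    phases = np.einsum('nmd,dk->nmk', diff, W)       # angle for each frequency
    return np.tensordot(np.cos(phases), v, axes=([2], [0]))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
sigma, M = 1.0, 500
W = rng.normal(scale=1.0 / sigma, size=(3, M))       # frequencies for a Gaussian kernel
v_uniform = np.full(M, 1.0 / M)                      # plain Monte Carlo weights

K_exact = np.exp(-((X[:, None] - X[None, :]) ** 2).sum(-1) / (2 * sigma ** 2))
K_hat = weighted_rf_kernel(X, X, W, v_uniform)
print("max abs error:", np.abs(K_exact - K_hat).max())
```

Swapping v_uniform for a tuned weight vector is the knob the quoted line refers to: better weights on the same frequencies can reduce the approximation error without drawing more features.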