2010 IEEE 51st Annual Symposium on Foundations of Computer Science
DOI: 10.1109/focs.2010.38

Efficient Volume Sampling for Row/Column Subset Selection

Abstract: We give efficient algorithms for volume sampling, i.e., for picking k-subsets of the rows of any given matrix with probabilities proportional to the squared volumes of the simplices defined by them and the origin (or the squared volumes of the parallelepipeds defined by these subsets of rows). This solves an open problem from the monograph on spectral algorithms by Kannan and Vempala (see Section 7.4 of [15], also implicit in [1,5]). Our first algorithm for volume sampling k-subsets of rows from an m-by-n matri…
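The distribution described above is easy to state directly, even though sampling from it efficiently is the paper's contribution: a k-subset S of rows of A is picked with probability proportional to det(A_S A_S^T), the squared volume of the parallelepiped spanned by those rows. The sketch below is a brute-force reference implementation of that definition (it enumerates all k-subsets, so it only runs for tiny m); the function name and structure are illustrative, not the paper's algorithm.

```python
import itertools
import numpy as np

def volume_sample(A, k, rng=None):
    """Draw a k-subset of the rows of A with probability proportional to
    det(A_S A_S^T), the squared volume of the parallelepiped spanned by
    the rows indexed by S. Brute force over all C(m, k) subsets."""
    rng = np.random.default_rng() if rng is None else rng
    m = A.shape[0]
    subsets = list(itertools.combinations(range(m), k))
    # Squared k-dimensional volume of each candidate row subset.
    weights = np.array([np.linalg.det(A[list(S)] @ A[list(S)].T)
                        for S in subsets])
    weights = np.clip(weights, 0.0, None)  # guard against round-off negatives
    probs = weights / weights.sum()
    return subsets[rng.choice(len(subsets), p=probs)]

# Example: sample a 2-subset of the rows of a random 6-by-4 matrix.
A = np.random.default_rng(0).standard_normal((6, 4))
print(volume_sample(A, k=2))
```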

Cited by 158 publications (224 citation statements). References 19 publications.
“…We show that O(k/ε) columns contain a rank-k subspace which reconstructs A to relative error, and we present the first sub-SVD (in terms of running time) randomized algorithm to identify these columns. This matches the Ω(k/ε) lower bound in [8] and improves the best known upper bound of O(k log k + k/ε) [6,8,12,23]. …”
supporting
confidence: 63%
“…There is considerable interest (e.g. [4,6,8,9,11,15,19,20,21]) in determining a minimum set of r ≪ n columns of A which is approximately as good as A_k at reconstructing A. Such columns are important for interpreting data [21], building robust machine learning algorithms [4], feature selection, etc.…”
mentioning
confidence: 99%
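The benchmark behind this statement is concrete: a chosen column subset C is judged by how close the projection error ‖A − CC⁺A‖_F comes to the best rank-k error ‖A − A_k‖_F achieved by the truncated SVD. A minimal sketch of that comparison (NumPy only; names are illustrative):

```python
import numpy as np

def reconstruction_errors(A, cols, k):
    """Frobenius error of projecting A onto the span of the chosen
    columns, alongside the best possible rank-k error ||A - A_k||_F."""
    C = A[:, cols]
    # Project A onto the column span of C via the pseudoinverse.
    err_cols = np.linalg.norm(A - C @ np.linalg.pinv(C) @ A, "fro")
    # Best rank-k error comes from the trailing singular values.
    s = np.linalg.svd(A, compute_uv=False)
    err_best = np.sqrt(np.sum(s[k:] ** 2))
    return err_cols, err_best

A = np.random.default_rng(1).standard_normal((50, 20))
print(reconstruction_errors(A, cols=[0, 3, 7], k=3))
```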
“…Thus, approximate algorithms with lower computational complexity have been presented in the relevant literature, with the goal of finding a suboptimal but acceptable solution. The proposed approaches include randomized, deterministic or hybrid methods, using SVD sparse approximation [16], random selection of matrix columns, based on a probability distribution, which is later refined deterministically [10], greedy recursive computation of the reconstruction error, initialized with random projections of the matrix columns [17], column subset selection with probabilities proportional to the squared volumes of the parallelepipeds defined by these subsets [18], etc.…”
Section: Multimodal Shot Pruning (MSP)
mentioning
confidence: 99%
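One of the approaches surveyed in this statement, greedy selection driven by the reconstruction error, is short enough to sketch. The baseline below adds, at each step, the column that most reduces the Frobenius residual; it is an illustrative version of the general idea, not the specific algorithm of the cited [17]:

```python
import numpy as np

def greedy_column_select(A, r):
    """Greedy column subset selection: at each step, add the column
    that minimizes the residual ||A - C pinv(C) A||_F. Illustrative
    baseline; cost is O(r * n) pseudoinverse projections."""
    n = A.shape[1]
    chosen = []
    for _ in range(r):
        best_j, best_err = None, np.inf
        for j in range(n):
            if j in chosen:
                continue
            C = A[:, chosen + [j]]
            err = np.linalg.norm(A - C @ np.linalg.pinv(C) @ A, "fro")
            if err < best_err:
                best_j, best_err = j, err
        chosen.append(best_j)
    return chosen

A = np.random.default_rng(2).standard_normal((30, 12))
print(greedy_column_select(A, r=4))
```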
“…These include the problem of sampling from a DPP, computing its partition function and the MAP-inference problem which asks to find the set of highest probability (or equivalently to find the largest coefficient of q(x)). For the case of unconstrained DPPs, problems (1) and (2) are quite well understood, and various solutions have been proposed [Kha95,DR10,AGR16,Nik15,SEFM15]. Recently, the case of constrained DPPs, when the support is restricted to a combinatorial family B ⊆ 2^[m], has been studied [NS16,SV16,CDKV16] with machine learning applications in mind; however, very little is known computationally.…”
Section: Introduction
mentioning
confidence: 99%
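For context on the partition function mentioned in this statement: for an L-ensemble DPP with PSD kernel L, the normalizer Σ_{S⊆[m]} det(L_S) has the closed form det(L + I), which is what makes it computable in the unconstrained case. A small numerical check of that identity (illustrative code, not from the cited works):

```python
import itertools
import numpy as np

# Identity behind the DPP partition function:
#   sum over all subsets S of det(L_S)  ==  det(L + I).
rng = np.random.default_rng(3)
B = rng.standard_normal((5, 5))
L = B @ B.T  # a PSD kernel for an L-ensemble DPP
m = L.shape[0]

brute = 1.0  # contribution of the empty set (empty determinant is 1)
for r in range(1, m + 1):
    for S in itertools.combinations(range(m), r):
        brute += np.linalg.det(L[np.ix_(S, S)])

closed = np.linalg.det(L + np.eye(m))
print(brute, closed)  # agree up to floating-point error
```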