2016
DOI: 10.1109/tkde.2016.2539943

Improved Practical Matrix Sketching with Guarantees

Abstract: Matrices have become essential data representations for many large-scale problems in data analytics, and hence matrix sketching is a critical task. Although much research has focused on improving the error/size tradeoff under various sketching paradigms, the many forms of error bounds make these approaches hard to compare in theory and in practice. This paper attempts to categorize and compare most known methods under row-wise streaming updates with provable guarantees, and then to tweak some of these methods …

Cited by 18 publications (21 citation statements)
References 63 publications
“…Fiber subset selection, also called tensor cross approximation (TCA), finds a small subset of fibers which approximates the entire data tensor. For the matrix case, this problem is known as the Column/Row Subset Selection or CUR Problem which has been thoroughly investigated and for which there exist several algorithms with almost matching lower bounds [64,82,140].…”
Section: Tensor Sketching Using Tucker Model
confidence: 99%
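
To make the quoted Column/Row Subset Selection idea concrete, here is a minimal CUR sketch in NumPy. It samples columns and rows uniformly at random and fits the linking matrix U by least squares; the function name and the uniform-sampling choice are illustrative assumptions (the algorithms with near-matching lower bounds cited above use more careful sampling, e.g. leverage scores), not the method of any particular reference.

```python
import numpy as np

def cur_approximation(A, c, r, rng=None):
    """Minimal CUR sketch: sample c columns and r rows of A uniformly,
    then fit the linking matrix U so that C @ U @ R approximates A."""
    rng = np.random.default_rng(rng)
    n, d = A.shape
    col_idx = rng.choice(d, size=c, replace=False)
    row_idx = rng.choice(n, size=r, replace=False)
    C = A[:, col_idx]            # n x c: a subset of columns (fibers)
    R = A[row_idx, :]            # r x d: a subset of rows
    # U minimizes ||A - C U R||_F, solved via pseudoinverses.
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)   # c x r
    return C, U, R

# Usage: a low-rank-plus-noise matrix is well captured by a few fibers.
A = np.outer(np.arange(100.0), np.ones(50)) \
    + 0.01 * np.random.default_rng(0).standard_normal((100, 50))
C, U, R = cur_approximation(A, c=5, r=5, rng=0)
print(np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A))
```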
“…Another similar line of work is the CUR factorization [4,10,12,14,27], where methods select c columns and r rows of A to form matrices C ∈ R^{n×c}, R ∈ R^{r×d} and U ∈ R^{c×r}, and construct the sketch as B = CUR. The only instance of this group that runs in input-sparsity time is [4]. Random projection techniques: these techniques [31,36,35,26] operate data-obliviously and maintain an r×d matrix B = SA using an r×n random matrix S which has the Johnson-Lindenstrauss Transform (JLT) property [28]. Random projection methods work in the streaming model, are computationally efficient, and sufficiently accurate in practice [7]. The state-of-the-art method of this approach is by Clarkson and Woodruff [6], which was later improved slightly in [30].…”
Section: Matrix Sketching Prior Art
confidence: 99%
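
As a sketch of the quoted B = SA construction, the snippet below uses a dense Gaussian S, which has the JLT property; the function name is an assumption, and the dense choice is for clarity only (the input-sparsity-time results mentioned above rely on sparse embeddings such as CountSketch instead).

```python
import numpy as np

def random_projection_sketch(A, r, rng=None):
    """Data-oblivious sketch B = S @ A with a dense Gaussian S (r x n).
    Entries ~ N(0, 1/r), so E[S.T @ S] = I and B.T @ B approximates
    A.T @ A in expectation."""
    rng = np.random.default_rng(rng)
    n, d = A.shape
    S = rng.standard_normal((r, n)) / np.sqrt(r)
    return S @ A  # r x d sketch

# Streaming-friendly: B = sum_i outer(S[:, i], a_i), so rows a_i of A
# can be folded in one at a time; here S is applied to A all at once.
A = np.random.default_rng(1).standard_normal((1000, 30))
B = random_projection_sketch(A, r=100, rng=1)
print(B.shape, np.linalg.norm(A.T @ A - B.T @ B) / np.linalg.norm(A.T @ A))
```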
“…Examples of these methods include different versions of iterative SVD [19,21,23,5,33]. These, however, do not have theoretical guarantees [7]. The FrequentDirections algorithm [24] is unique in this group in that it offers strong error guarantees.…”
Section: Matrix Sketching Prior Art
confidence: 99%
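
Since FrequentDirections [24] is singled out for its error guarantee, a minimal sketch of the basic algorithm may be useful. This follows Liberty's original formulation with a shrink after every inserted row, not the improved variants the surveyed paper proposes; real implementations typically buffer about 2ℓ rows between SVDs for speed.

```python
import numpy as np

def frequent_directions(A, ell):
    """Minimal FrequentDirections: maintain an ell x d sketch B of a
    row-wise stream; after each insertion, shrink all directions by the
    squared smallest retained singular value.  Guarantees
    ||A.T @ A - B.T @ B||_2 <= ||A||_F^2 / ell."""
    _, d = A.shape
    B = np.zeros((ell, d))
    for row in A:                      # row-wise streaming updates
        B[-1] = row                    # last row of B is always zero (see below)
        _, s, Vt = np.linalg.svd(B, full_matrices=False)
        delta = s[-1] ** 2             # squared smallest singular value
        s = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
        B = s[:, None] * Vt            # shrinking re-zeroes the last row
    return B

A = np.random.default_rng(2).standard_normal((500, 40))
B = frequent_directions(A, ell=10)
err = np.linalg.norm(A.T @ A - B.T @ B, 2)
print(err <= np.linalg.norm(A, 'fro') ** 2 / 10)  # True: the guarantee holds
```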
“…This is achieved by using a "forgetting factor" in Step b of Algorithm 1. Such an extension is crucial, as there are pathological examples where (static) MOSES and iSVD both fail to follow the changes in the distribution of data [29]. This important research direction is left for future work.…”
Section: Prior Art
confidence: 99%
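
The quoted "Step b of Algorithm 1" belongs to the citing paper (MOSES) and is not reproduced here. The sketch below only illustrates the generic forgetting-factor idea on a FrequentDirections-style update: scaling the maintained sketch by λ ∈ (0, 1] before each row is folded in makes old data decay geometrically, so the sketch can track a drifting distribution. The function name and the default λ are illustrative assumptions.

```python
import numpy as np

def fd_with_forgetting(A, ell, lam=0.98):
    """Illustrative sketch: FrequentDirections-style update with a
    forgetting factor lam in (0, 1]; lam = 1 recovers plain FD."""
    _, d = A.shape
    B = np.zeros((ell, d))
    for row in A:
        B *= lam                       # geometrically discount past rows
        B[-1] = row                    # last row of B is always zero
        _, s, Vt = np.linalg.svd(B, full_matrices=False)
        # shrink by the smallest squared singular value, which
        # re-zeroes the last row of B for the next insertion
        s = np.sqrt(np.maximum(s ** 2 - s[-1] ** 2, 0.0))
        B = s[:, None] * Vt
    return B
```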