We show how to approximate a data matrix A with a much smaller sketch Ã that can be used to solve a general class of constrained k-rank approximation problems to within (1 + ε) error. Importantly, this class includes k-means clustering and unconstrained low rank approximation (i.e. principal component analysis). By reducing data points to just O(k) dimensions, we generically accelerate any exact, approximate, or heuristic algorithm for these ubiquitous problems. For k-means dimensionality reduction, we provide (1 + ε) relative error results for many common sketching techniques, including random row projection, column selection, and approximate SVD. For approximate principal component analysis, we give a simple alternative to known algorithms that has applications in the streaming setting. Additionally, we extend recent work on column-based matrix reconstruction, giving column subsets that not only 'cover' a good subspace for A, but can be used directly to compute this subspace. Finally, for k-means clustering, we show how to achieve a (9 + ε) approximation by Johnson-Lindenstrauss projecting data to just O(log k / ε²) dimensions. This is the first result that leverages the specific structure of k-means to achieve dimension independent of input size and sublinear in k.
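To make the k-means reduction concrete, the following minimal sketch (Python with NumPy and scikit-learn, both assumed available) projects the data to roughly O(log k / ε²) dimensions with a dense Gaussian map and then runs an off-the-shelf k-means solver on the sketch; the function name, the constant factor in the target dimension, and the choice of solver are illustrative assumptions rather than details from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

def jl_then_kmeans(A, k, eps=0.5, seed=0):
    """Cluster the n rows of A after a Gaussian Johnson-Lindenstrauss projection.

    Target dimension d = O(log k / eps^2) follows the abstract; the constant
    factor of 4 and the function name are illustrative assumptions.
    """
    n, orig_dim = A.shape
    d = max(1, min(orig_dim, int(np.ceil(4 * np.log(k) / eps ** 2))))
    rng = np.random.default_rng(seed)
    # Dense Gaussian sketching matrix, scaled to preserve norms in expectation.
    Pi = rng.normal(size=(orig_dim, d)) / np.sqrt(d)
    A_small = A @ Pi  # n x d instead of n x orig_dim
    # Any exact, approximate, or heuristic k-means solver can be run on the sketch.
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(A_small)
```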
The rise of social media and online social networks has been a disruptive force in society. Opinions are increasingly shaped by interactions on online social media, and social phenomena including disagreement and polarization are now tightly woven into everyday life. In this work we initiate the study of the following question: Given n agents, each with its own initial opinion that reflects its core value on a topic, and an opinion dynamics model, what is the structure of a social network that minimizes polarization and disagreement simultaneously? This question is central to recommender systems: should a recommender system prefer a link suggestion between two online users with similar mindsets in order to keep disagreement low, or between two users with different opinions in order to expose each to the other's viewpoint of the world, and decrease overall levels of polarization? Such decisions have an important global effect on society [51]. Our contributions include a mathematical formalization of this question as an optimization problem and an exact, time-efficient algorithm. We also prove that there always exists a network with O(n/ε²) edges that is a (1 + ε) approximation to the optimum. Our formulation is an instance of optimization over graph topologies, also considered e.g., in [7,12,48]. For a fixed graph, we additionally show how to optimize our objective function over the agents' innate opinions in polynomial time. We perform an empirical study of our proposed methods on synthetic and real-world data that verifies their value as mining tools for better understanding the trade-off between disagreement and polarization. We find that there is a lot of room to reduce both polarization and disagreement in real-world networks; for instance, on a Reddit network where users exchange comments on politics, our methods achieve a ∼60,000-fold reduction in polarization and disagreement. Our code is available at https://github.com/tsourolampis/polarization-disagreement.
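The abstract does not spell out the opinion dynamics model, so the sketch below (Python/NumPy) assumes the Friedkin-Johnsen dynamics commonly used in this line of work, where equilibrium opinions are z = (I + L)⁻¹ s for graph Laplacian L and innate opinions s; the exact definitions of polarization and disagreement here are illustrative assumptions, not necessarily the paper's formalization.

```python
import numpy as np

def polarization_disagreement(L, s):
    """Polarization plus disagreement for a single graph, under assumed
    Friedkin-Johnsen dynamics: equilibrium opinions z = (I + L)^{-1} s.

    L : (n, n) graph Laplacian.
    s : (n,) innate opinions.
    These definitions are an illustrative assumption for this example.
    """
    n = L.shape[0]
    s_bar = s - s.mean()                        # mean-center the innate opinions
    z = np.linalg.solve(np.eye(n) + L, s_bar)   # equilibrium (expressed) opinions
    polarization = float(z @ z)                 # spread of expressed opinions around the mean
    disagreement = float(z @ L @ z)             # sum of w_ij * (z_i - z_j)^2 over edges
    return polarization + disagreement
```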
No abstract
Random sampling has become a critical tool in solving massive matrix problems. For linear regression, a small, manageable set of data rows can be randomly selected to approximate a tall, skinny data matrix, improving processing time significantly. For theoretical performance guarantees, each row must be sampled with probability proportional to its statistical leverage score. Unfortunately, leverage scores are difficult to compute. A simple alternative is to sample rows uniformly at random. While this often works, uniform sampling will eliminate critical row information for many natural instances. We take a fresh look at uniform sampling by examining what information it does preserve. Specifically, we show that uniform sampling yields a matrix that, in some sense, well approximates a large fraction of the original. While this weak form of approximation is not enough for solving linear regression directly, it is enough to compute a better approximation. This observation leads to simple iterative row sampling algorithms for matrix approximation that run in input-sparsity time and preserve row structure and sparsity at all intermediate steps. In addition to an improved understanding of uniform sampling, our main proof introduces a structural result of independent interest: we show that every matrix can be made to have low coherence by reweighting a small subset of its rows.
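As background for the contrast the abstract draws between uniform and leverage score sampling, here is a minimal sketch (Python/NumPy) of basic leverage score row sampling; it computes exact scores via an SVD rather than the paper's input-sparsity-time iterative scheme, and the function name and rescaling convention are illustrative assumptions.

```python
import numpy as np

def leverage_score_sample(A, num_samples, seed=0):
    """Sample rows of a tall matrix A with probability proportional to their
    statistical leverage scores, rescaling so A^T A is approximated in expectation.

    Illustrative only: scores are computed exactly via a thin SVD, not by the
    iterative row sampling scheme described in the abstract.
    """
    # Leverage score of row i is the squared norm of the i-th row of U,
    # where A = U S V^T is a thin SVD of A.
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    scores = (U ** 2).sum(axis=1)
    probs = scores / scores.sum()
    rng = np.random.default_rng(seed)
    idx = rng.choice(A.shape[0], size=num_samples, replace=True, p=probs)
    # Rescale each sampled row so the sampled matrix is unbiased for A^T A.
    scale = 1.0 / np.sqrt(num_samples * probs[idx])
    return A[idx] * scale[:, None]
```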
We present the first single-pass algorithm for computing spectral sparsifiers of graphs in the dynamic semi-streaming model. Given a single pass over a stream containing insertions and deletions of edges to a graph G, our algorithm maintains a randomized linear sketch of the incidence matrix of G into dimension O((1/ε²) n polylog(n)). Using this sketch, at any point, the algorithm can output a (1 ± ε) spectral sparsifier for G with high probability. While O((1/ε²) n polylog(n)) space algorithms are known for computing cut sparsifiers in dynamic streams [AGM12b, GKP12] and spectral sparsifiers in insertion-only streams [KL11], prior to our work, the best known single-pass algorithm for maintaining spectral sparsifiers in dynamic streams required sketches of dimension Ω((1/ε²) n^{5/3}). To achieve our result, we show that, using a coarse sparsifier of G and a linear sketch of G's incidence matrix, it is possible to sample edges by effective resistance, obtaining a spectral sparsifier of arbitrary precision. Sampling from the sketch requires a novel application of ℓ₂/ℓ₂ sparse recovery, a natural extension of the ℓ₀ methods used for cut sparsifiers in [AGM12b]. Recent work of [MP12] on row sampling for matrix approximation gives a recursive approach for obtaining the required coarse sparsifiers. Under certain restrictions, our approach also extends to the problem of maintaining a spectral approximation for a general matrix A^T A given a stream of updates to rows in A.
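The core sampling primitive referenced in the abstract, sampling edges with probability proportional to weight times effective resistance, can be illustrated offline as follows (Python/NumPy); this is not the streaming sketch itself, and the oversampling constant and function name are assumptions made for the example.

```python
import numpy as np

def effective_resistance_sparsify(edges, weights, n, eps=0.5, seed=0):
    """Offline effective-resistance sampling of a weighted graph on n vertices.

    edges : list of (u, v) pairs, weights : array of positive edge weights.
    Illustrative sketch of the sampling primitive only; the oversampling
    constant 8 is an arbitrary assumption.
    """
    rng = np.random.default_rng(seed)
    # Build the Laplacian L = B^T W B from the signed edge-vertex incidence matrix B.
    B = np.zeros((len(edges), n))
    for i, (u, v) in enumerate(edges):
        B[i, u], B[i, v] = 1.0, -1.0
    L = B.T @ (weights[:, None] * B)
    L_pinv = np.linalg.pinv(L)
    # Effective resistance of edge (u, v) is (e_u - e_v)^T L^+ (e_u - e_v).
    resist = np.einsum('ij,jk,ik->i', B, L_pinv, B)
    probs = np.minimum(1.0, 8 * np.log(n) / eps ** 2 * weights * resist)
    kept = rng.random(len(edges)) < probs
    # Reweight kept edges by 1/p_e so the sparsifier is unbiased.
    new_weights = np.where(kept, weights / probs, 0.0)
    return [(e, w) for e, w, k in zip(edges, new_weights, kept) if k]
```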
Theoretically elegant and ubiquitous in practice, the Lanczos method can approximate f(A)x for any symmetric matrix A ∈ R^{n×n}, vector x ∈ R^n, and function f. In exact arithmetic, the method's error after k iterations is bounded by the error of the best degree-k polynomial uniformly approximating the scalar function f(x) on the range [λ_min(A), λ_max(A)]. However, despite decades of work, it has been unclear if this powerful guarantee holds in finite precision. We resolve this problem, proving that when max_{x ∈ [λ_min, λ_max]} |f(x)| ≤ C, Lanczos essentially matches the exact arithmetic guarantee if computations use roughly log(nC‖A‖) bits of precision. Our proof extends work of Druskin and Knizhnerman [11], leveraging the stability of the classic Chebyshev recurrence to bound the stability of any polynomial approximating f(x). We also study the special case of f(A) = A^{-1} for positive definite A, where stronger guarantees hold for Lanczos. In exact arithmetic the algorithm performs as well as the best polynomial approximating 1/x at each of A's eigenvalues, rather than on the full range [λ_min(A), λ_max(A)]. In seminal work, Greenbaum gives a natural approach to extending this bound to finite precision: she proves that finite precision Lanczos and the related conjugate gradient method match any polynomial approximating 1/x in a tiny range around each eigenvalue [17]. For A^{-1}, Greenbaum's bound appears stronger than our result. However, we exhibit matrices with condition number κ where exact arithmetic Lanczos converges in polylog(κ) iterations, but Greenbaum's bound predicts at best Ω(κ^{1/5}) iterations in finite precision. It thus cannot offer more than a polynomial improvement over the O(κ^{1/2}) bound achievable via our result for general f(A). Our analysis bounds the power of stable approximating polynomials and raises the question of whether they fully characterize the behavior of finite precision Lanczos in solving linear systems. If they do, convergence in fewer than poly(κ) iterations cannot be expected, even for matrices with clustered, skewed, or otherwise favorable eigenvalue distributions.
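For reference, here is a minimal exact-arithmetic sketch of the Lanczos method for approximating f(A)x (Python/NumPy, with full reorthogonalization for simplicity); the iteration count k, the breakdown tolerance, and the function name are illustrative choices, and this is not the finite-precision analysis described above.

```python
import numpy as np

def lanczos_f_of_A_times_x(A, x, f, k):
    """Approximate f(A) @ x with k Lanczos iterations for symmetric A.

    f is applied to the eigenvalues of the small tridiagonal matrix T.
    Full reorthogonalization is used for simplicity; illustrative sketch only.
    """
    n = x.shape[0]
    Q = np.zeros((n, k + 1))
    alpha, beta = np.zeros(k), np.zeros(k)
    Q[:, 0] = x / np.linalg.norm(x)
    for j in range(k):
        w = A @ Q[:, j]
        alpha[j] = Q[:, j] @ w
        w -= alpha[j] * Q[:, j]
        if j > 0:
            w -= beta[j - 1] * Q[:, j - 1]
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)   # full reorthogonalization
        beta[j] = np.linalg.norm(w)
        if beta[j] < 1e-12:                        # breakdown: Krylov space exhausted
            k = j + 1
            break
        Q[:, j + 1] = w / beta[j]
    # Build the k x k tridiagonal T and apply f via its eigendecomposition.
    T = np.diag(alpha[:k]) + np.diag(beta[:k - 1], 1) + np.diag(beta[:k - 1], -1)
    evals, evecs = np.linalg.eigh(T)
    fT = evecs @ np.diag(f(evals)) @ evecs.T
    # Approximation: f(A) x ~ ||x|| * Q_k f(T) e_1.
    return np.linalg.norm(x) * (Q[:, :k] @ fT[:, 0])
```

For instance, passing f = np.exp approximates the action of the matrix exponential, while f = lambda t: 1.0 / t corresponds to the A^{-1} case discussed in the abstract.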
We present a new algorithm for finding a near optimal low-rank approximation of a matrix A in O(nnz(A)) time. Our method is based on a recursive sampling scheme for computing a representative subset of A's columns, which is then used to find a low-rank approximation. This approach differs substantially from prior O(nnz(A)) time algorithms, which are all based on fast Johnson-Lindenstrauss random projections. Our algorithm matches the guarantees of the random projection methods while offering a number of advantages. In addition to better performance on sparse and structured data, sampling algorithms can be applied in settings where random projections cannot. For example, we give new streaming algorithms for the column subset selection and projection-cost preserving sample problems. Our method has also been used in the fastest algorithms for provably accurate Nyström approximation of kernel matrices [56].
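As a rough illustration of column-sampling-based low-rank approximation (not the paper's recursive scheme), the sketch below samples columns by squared norm and projects A onto the best rank-k subspace of the sample; the number of sampled columns and the sampling distribution are simplifying assumptions for the example.

```python
import numpy as np

def column_sample_low_rank(A, k, c=None, seed=0):
    """Rank-k approximation of A built from a sampled column subset.

    Columns are drawn with probability proportional to their squared norms,
    a simple surrogate for the recursive sampling described in the abstract;
    A is then projected onto the best rank-k subspace of the sample.
    """
    rng = np.random.default_rng(seed)
    n_cols = A.shape[1]
    if c is None:
        c = 10 * k                      # number of sampled columns (assumed constant)
    probs = (A ** 2).sum(axis=0)
    probs = probs / probs.sum()
    idx = rng.choice(n_cols, size=min(c, n_cols), replace=True, p=probs)
    C = A[:, idx]
    # Orthonormal basis for the span of the sampled columns.
    Q, _ = np.linalg.qr(C)
    # Best rank-k approximation of A within that span.
    B = Q.T @ A
    U, S, Vt = np.linalg.svd(B, full_matrices=False)
    return ((Q @ U[:, :k]) * S[:k]) @ Vt[:k]
```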