Sublinear estimation of a single element in sparse linear systems

Shyamkumar, Nitin; Banerjee, Siddhartha; Lofgren, Peter

doi:10.1109/allerton.2016.7852323

Cited by 7 publications

(6 citation statements)

References 10 publications

“…1 , which by Lemma 16 with probability 1 − δ 2 is within a multiplicative (1 ± ε) of its own expectation for all u ∈ B. We thus get a multiplicative (1 ± ε)-approximation of the rightmost summation in Equation 53, while for the remaining terms we use the concentration bounds of Subsection A.12.5, as for PageRank.…”

Section: A.12.2 Subgraph Estimators (mentioning)

confidence: 98%

“…Finally, we shall mention recent work on the local approximation of the stationary probability of a target state v in a Markov chain [44,8,18], and on the local approximation of a single entry of the solution vector of a linear system [45,53]. The local approximation of P(v) is a specific but nontrivial case of both, and we hope that our techniques may serve as an entry point for future developments in those directions.…”

Section: Related Work (mentioning)

confidence: 99%

Sublinear Algorithms for Local Graph Centrality Estimation

Bressan

Peserico

Pretto

2018

2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS)

We study the complexity of local graph centrality estimation, with the goal of approximating the centrality score of a given target node while exploring only a sublinear number of nodes/arcs of the graph and performing a sublinear number of elementary operations. We develop a technique, which we apply to the PageRank and Heat Kernel centralities, for building a low-variance score estimator through a local exploration of the graph. We obtain an algorithm that, given any node in any graph of $m$ arcs, with probability $(1 - \delta)$ computes a multiplicative $(1 \pm \epsilon)$-approximation of its score by examining only $\tilde{O}\left(\min\left(m^{2/3}\Delta^{1/3}d^{-2/3},\ m^{4/5}d^{-3/5}\right)\right)$ nodes/arcs, where $\Delta$ and $d$ are respectively the maximum and average outdegree of the graph (omitting for readability $\mathrm{poly}(\epsilon^{-1})$ and $\mathrm{polylog}(\delta^{-1})$ factors). A similar bound holds for computational complexity. We also prove a lower bound of $\Omega\left(\min\left(m^{1/2}\Delta^{1/2}d^{-1/2},\ m^{2/3}d^{-1/3}\right)\right)$ for both query complexity and computational complexity. Moreover, our technique yields an $\tilde{O}(n^{2/3})$ query complexity algorithm for the graph access model of Brautbar et al. [14], widely used in social network mining; we show this algorithm is optimal up to a sublogarithmic factor. These are the first algorithms yielding worst-case sublinear bounds for general directed graphs and any choice of the target node.

* This is the full version of a paper accepted for publication at IEEE FOCS 2018.

Introduction

Computing graph centralities efficiently is essential to modern network analysis. With the advent of web and social networks, the prototypical scenario involves massive graphs on millions or even billions of nodes and arcs. On these input graphs, traditional approaches such as Monte Carlo simulations and algebraic techniques are often impractical, if not entirely useless, since their cost can scale linearly or superlinearly with the size of the graph.

An alternative approach is that of local graph algorithms, which, broadly speaking, work by exploring only a small portion of the graph around a given target node. Local algorithms are justified by the fact that, often, one does not need an exact computation of the entire score vector, but only a quick approximation for a few nodes of interest. In exchange, one hopes to drastically reduce both the running time and the portion of the graph to be fetched. One of the best-known examples is perhaps local graph clustering [4,54,33].

In this paper we address the problem of locally approximating the centrality score of a node in a graph, focusing on the PageRank and heat kernel centralities. PageRank [20] is a classic graph centrality measure with a vast number of applications including local graph clustering [4], trendsetter identification [52], spam filtering [40], link prediction [39] and many more (see [35] and [23]); it has been named one of the top 10 algorithms in data mining [55]. Heat kernel [24] can be seen as a variant of PageRank that satisfies the heat equation. Its applications span biological network analysis [31,30] and solving local linear systems...

“…This method has been independently studied for the specific setting of computing PageRank: Andersen et al. proposed an iterative method which relies on the conditions that G is a nonnegative scaled stochastic matrix, z is entry-wise positive and bounded strictly away from zero, and the solution x is a probability vector (i.e., consisting of nonnegative entries that sum to 1) [23]. There has been subsequent follow-up work which builds upon an earlier version of our paper to design bidirectional local algorithms that combine both iterative algorithms and Monte Carlo methods [24], [25].…”

Section: Local Algorithms (mentioning)

confidence: 99%

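“…into (25) to show that

$$\mathbb{E}_P\big[\|r^{(t+1)}\|_2^2 \mid r^{(t)}\big] = r^{(t)T}\left(I - \frac{2I - G - G^T - D}{\min(td,n)}\right) r^{(t)}. \tag{29}$$

We substitute (29) and Lemma 11.1a into (24) to show that

$$\begin{aligned}
\mathbb{E}_P\big[\|r^{(t+1)} - \mathbb{E}_P[r^{(t+1)}]\|_2^2 \mid r^{(t)}\big]
&= r^{(t)T}\left(I - \frac{2I - G - G^T - D}{\min(td,n)} - \left(I - \frac{I - G}{\min(td,n)}\right)\left(I - \frac{I - G}{\min(td,n)}\right)^T\right) r^{(t)}, \\
&= r^{(t)T}\left(\frac{D}{\min(td,n)} - \frac{(I - G)(I - G^T)}{\min(td,n)^2}\right) r^{(t)}, \\
&\le \left\|\frac{D}{\min(td,n)} - \frac{(I - G)(I - G^T)}{\min(td,n)^2}\right\|_2
\end{aligned}$$
…”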

mentioning

confidence: 99%

Asynchronous Approximation of a Single Component of the Solution to a Linear System

Ozdaglar

Shah

2020

IEEE Trans. Netw. Sci. Eng.

We present a distributed asynchronous algorithm for approximating a single component of the solution to a system of linear equations $Ax = b$, where $A$ is a positive definite real matrix and $b \in \mathbb{R}^n$. This can equivalently be formulated as solving for $x_i$ in $x = Gx + z$ for some $G$ and $z$ such that the spectral radius of $G$ is less than 1. Our algorithm relies on the Neumann series characterization of the component $x_i$, and is based on residual updates. We analyze our algorithm within the context of a cloud computation model motivated by frameworks such as Apache Spark, in which the computation is split into small update tasks performed by small processors with shared access to a distributed file system. We prove a robust asymptotic convergence result when the spectral radius $\rho(|G|) < 1$, regardless of the precise order and frequency in which the update tasks are performed. We provide convergence rate bounds which depend on the order of update tasks performed, analyzing both deterministic update rules via counting weighted random walks, as well as probabilistic update rules via concentration bounds. The probabilistic analysis requires analyzing the product of random matrices which are drawn from distributions that are time and path dependent. We specifically consider the setting where $n$ is large, yet $G$ is sparse, e.g., each row has at most $d$ nonzero entries. This is motivated by applications in which $G$ is derived from the edge structure of an underlying graph. Our results prove that if the local neighborhood of the graph does not grow too quickly as a function of $n$, our algorithm can provide significant reduction in computation cost as opposed to any algorithm which computes the global solution vector $x$.

Our algorithm obtains an $\epsilon\|x\|_2$ additive approximation for $x_i$ in constant time with respect to the size of the matrix when the maximum row sparsity $d = O(1)$ and $1/(1 - \|G\|_2) = O(1)$, where $\|G\|_2$ is the induced matrix operator 2-norm.

Index Terms: linear system of equations, local computation, asynchronous randomized algorithms, distributed algorithms

arXiv:1411.2647v4 [cs.DS] 21 Jan 2019

1. By optimal, we would like to minimize $\rho(G)$, which maximizes the convergence rate of the algorithm.
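
The residual-update idea behind the Neumann series characterization can be illustrated with a small synchronous sketch. This is not the authors' asynchronous algorithm: the function name `estimate_component`, the dictionary-of-rows layout `G_rows`, the stopping tolerance `tol`, and the rule of always updating the coordinate with the largest residual are all assumptions made for illustration. The sketch maintains the invariant x_i = estimate + Σ_u residual[u]·x_u and uses x_u = Σ_j G[u][j]·x_j + z[u] to push residual mass one step along row u.

```python
def estimate_component(G_rows, z, i, tol=1e-8, max_updates=100_000):
    """Approximate x_i where x = G x + z, touching only rows reached from i.

    G_rows[u] is a dict {j: G[u][j]} of the nonzero entries in row u
    (hypothetical sparse layout). Assumes the spectral radius of |G| is
    below 1, so the residual shrinks as updates are applied.
    """
    estimate = 0.0
    residual = {i: 1.0}  # invariant: x_i = estimate + sum_u residual[u] * x_u
    for _ in range(max_updates):
        if not residual:
            break
        # Illustrative update rule: pick the coordinate with largest residual.
        u = max(residual, key=lambda k: abs(residual[k]))
        r_u = residual[u]
        if abs(r_u) <= tol:
            break
        residual.pop(u)
        # Substitute x_u = sum_j G[u][j] * x_j + z[u] into the invariant.
        estimate += r_u * z[u]
        for j, g_uj in G_rows.get(u, {}).items():
            residual[j] = residual.get(j, 0.0) + r_u * g_uj
    return estimate
```

Only rows reachable from index i are ever read, which is where the saving over computing the whole vector x comes from when G is sparse and the neighborhood of i grows slowly.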

“…via the power method [10], can be simply infeasible. As an alternative one can then resort to approximating only individual entries of the vector, in exchange for a much lower computational complexity [13,20]. In fact, if such a complexity is low enough one could efficiently "sketch" the whole vector by quickly getting a fair estimate of its entries.…”

Section: Introduction (mentioning)

confidence: 99%

On Approximating the Stationary Distribution of Time-Reversible Markov Chains

2019

Approximating the stationary probability of a state in a Markov chain through Markov chain Monte Carlo techniques is, in general, inefficient. Standard random walk approaches require $\tilde{O}(\tau/\pi(v))$ operations to approximate the probability $\pi(v)$ of a state $v$ in a chain with mixing time $\tau$, and even the best available techniques still have complexity $\tilde{O}(\tau^{1.5}/\pi(v)^{0.5})$; and since these complexities depend inversely on $\pi(v)$, they can grow beyond any bound in the size of the chain or in its mixing time. In this paper we show that, for time-reversible Markov chains, there exists a simple randomized approximation algorithm that breaks this "small-$\pi(v)$ barrier".
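
For context, the "standard random walk approach" that the abstract uses as a baseline can be sketched as follows. This is only the naive visit-frequency estimator, not the algorithm proposed in the paper; the function name `estimate_stationary`, the nested-dictionary `transitions` format, and the fixed `seed` are assumptions chosen for illustration. The estimate needs enough steps to hit v many times, which is why its cost grows as π(v) shrinks.

```python
import random


def estimate_stationary(transitions, v, steps, start, seed=0):
    """Estimate pi(v) as the fraction of walk steps that land on state v.

    transitions[s] is a dict {next_state: probability} (hypothetical format).
    The chain is assumed ergodic, so the visit frequency converges to pi(v).
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    state = start
    visits = 0
    for _ in range(steps):
        next_states, probs = zip(*transitions[state].items())
        state = rng.choices(next_states, weights=probs)[0]
        if state == v:
            visits += 1
    return visits / steps
```

For a two-state chain with transitions 0→0 (0.9), 0→1 (0.1), 1→0 (0.5), 1→1 (0.5), the exact stationary distribution is (5/6, 1/6), so a long walk should return a value close to 1/6 for v = 1.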
