How large is your graph?

Kanade, Varun; Mallmann-Trenn, Frederik; Verdugo, Victor

doi:10.48550/arxiv.1702.03959

Cited by 2 publications

(4 citation statements)

References 14 publications

“…Katzir et al [13] showed, through a variant of collision counting, that taking $k = O(\sqrt{n} + m/n)$ suffices to estimate $n$ (if one is willing to use more information about the graph, the bound may be improved to $k = O(\|\pi\|_2^{-1} + m/n)$, where $\|\pi\|_2$ is the Euclidean norm of the stationary distribution $\pi$). Kanade et al [12] established a corresponding lower bound for $k$ in this setting. This yields a time complexity of $t_{\mathrm{unif}}(\sqrt{n} + m/n)$.…”

Section: /4 Rel (mentioning)

confidence: 99%
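The collision-counting idea in the statement above can be illustrated with a short sketch. It assumes the samples are independent draws from the stationary distribution of the walk (probability proportional to degree), which in practice would cost roughly $t_{\mathrm{unif}}$ walk steps per sample. The function name `estimate_num_vertices`, the `degree` dictionary, and the path graph in the demo are illustrative assumptions, not taken from the cited papers.

```python
import random
from collections import Counter


def estimate_num_vertices(samples, degree):
    """Collision-based estimate of n from degree-biased vertex samples.

    samples: list of sampled vertex ids (assumed i.i.d. stationary draws).
    degree:  dict mapping each vertex id to its degree.
    Returns None when no collision was observed (too few samples).
    """
    k = len(samples)
    counts = Counter(samples)
    # Number of index pairs i < j whose samples hit the same vertex.
    collisions = sum(c * (c - 1) // 2 for c in counts.values())
    if collisions == 0:
        return None
    sum_deg = sum(degree[v] for v in samples)
    sum_inv = sum(1.0 / degree[v] for v in samples)
    # Sum of d_i / d_j over ordered pairs i != j (each diagonal term is 1).
    ordered_ratio_sum = sum_deg * sum_inv - k
    return ordered_ratio_sum / (2 * collisions)


if __name__ == "__main__":
    # Hypothetical test graph: a path on 100 vertices (end points have degree 1).
    n = 100
    degree = {v: (1 if v in (0, n - 1) else 2) for v in range(n)}
    rng = random.Random(0)
    vertices = list(degree)
    weights = [degree[v] for v in vertices]
    samples = rng.choices(vertices, weights=weights, k=2000)
    print(estimate_num_vertices(samples, degree))
```

The ratio works because, under degree-biased sampling, the expected ordered-pair sum is $k(k-1)\,\mathbb{E}[d]\,\mathbb{E}[1/d]$ while the expected number of collisions is $\binom{k}{2}\sum_v \pi(v)^2$, and the quotient of these two quantities equals $n$.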

Estimating graph parameters with random walks

2019

An algorithm observes the trajectories of random walks over an unknown graph $G$, starting from the same vertex $x$, as well as the degrees along the trajectories. For all finite connected graphs, one can estimate the number of edges $m$ up to a bounded factor in $O\left(t_{\mathrm{rel}}^{3/4}\sqrt{m/d}\right)$ steps, where $t_{\mathrm{rel}}$ is the relaxation time of the lazy random walk on $G$ and $d$ is the minimum degree in $G$. Alternatively, $m$ can be estimated in $O\left(t_{\mathrm{unif}} + t_{\mathrm{rel}}^{5/6}\sqrt{n}\right)$, where $n$ is the number of vertices and $t_{\mathrm{unif}}$ is the uniform mixing time on $G$. The number of vertices $n$ can then be estimated up to a bounded factor in an additional $O\left(t_{\mathrm{unif}}\frac{m}{n}\right)$ steps. Our algorithms are based on counting the number of intersections of random walk paths $X, Y$, i.e. the number of pairs $(t, s)$ such that $X_t = Y_s$. This improves on previous estimates which only consider collisions (i.e., times $t$ with $X_t = Y_t$). We also show that the complexity of our algorithms is optimal, even when restricting to graphs with a prescribed relaxation time. Finally, we show that, given either $m$ or the mixing time of $G$, we can compute the "other parameter" with a self-stopping algorithm.

2010 Mathematics Subject Classification. 60J10, 05C81, 05C85, 62M05.
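The difference between the two statistics named in the abstract can be made concrete with a small sketch. It only computes the counted quantities for two given paths; it is not the estimator analyzed in the paper, and the function names and example paths are illustrative assumptions.

```python
from collections import Counter


def count_collisions(path_x, path_y):
    """Number of times t at which both walks occupy the same vertex."""
    return sum(1 for x, y in zip(path_x, path_y) if x == y)


def count_intersections(path_x, path_y):
    """Number of pairs (t, s) such that path_x[t] == path_y[s]."""
    visits_x = Counter(path_x)
    visits_y = Counter(path_y)
    # Each vertex contributes (visits by X) * (visits by Y) pairs.
    return sum(cnt * visits_y[v] for v, cnt in visits_x.items())


if __name__ == "__main__":
    # Two hypothetical trajectories, listed as visited vertex ids.
    x = [0, 1, 2, 1]
    y = [1, 1, 3, 0]
    print(count_collisions(x, y), count_intersections(x, y))
```

Every collision is also an intersection (the pair with $s = t$), so for paths of equal length the intersection count is never smaller, which is one way to see why intersections extract more information from the same number of steps.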

Section: /4 Rel (mentioning)

confidence: 99%

“…This yields a time complexity of $t_{\mathrm{unif}}(\sqrt{n} + m/n)$. Kanade et al [12] asked whether the factor $t_{\mathrm{unif}}$ in those bounds was really necessary or whether more efficient estimators could be designed. Indeed, in those methods, each unit of information already costs $t_{\mathrm{unif}}$ steps.…”

Section: /4 Rel (mentioning)

confidence: 99%

Estimating graph parameters with random walks

2019

“…) samples to obtain an $\epsilon$ approximation with probability at least $1 - \delta$. It has been shown the number of samples in Katzir's algorithm is necessary ([12]). The Katzir et al algorithm implies an upper bound of $t_{\mathrm{mix}}$ max{ 1…”

Section: Number Of Downloads To Estimate the Number Of Vertices (mentioning)

confidence: 99%

“…In data mining, one often seeks algorithms that can return (approximate) properties of online social networks, so to study and analyze them, but without having to download the millions, or billions, of vertices that they are made up of. The properties of interest range from the order of the graph [11,12], to its average degree (or its degree distribution) [8][9][10], to the average clustering coefficient [22,24] or triangle counting [16], to non-topological properties such as the average score that the social network's users assign to a movie or a song, or to the fraction of people that like a specific article or page. All these problems have trivial solutions when the graph (with its non-topological attributes) is stored in main memory, or in the disk: choosing a few independent and uniform at random vertices from the graph, and computing their contribution to the (additive) property of interest, is sufficient to estimate the (unknown) value of the graph property - the empirical average of the contributions of the randomly chosen vertices will be close to the right value with high probability, by the central limit theorem.…”

Section: Introduction (mentioning)

confidence: 99%

On the Complexity of Sampling Nodes Uniformly from a Graph

Chierichetti; Haddadan

2017

Preprint

We study a number of graph exploration problems in the following natural scenario: an algorithm starts exploring an undirected graph from some seed vertex; the algorithm, for an arbitrary vertex $v$ that it is aware of, can ask an oracle to return the set of the neighbors of $v$. (In the case of social networks, a call to this oracle corresponds to downloading the profile page of user $v$.) The goal of the algorithm is to either learn something (e.g., average degree) about the graph, or to return some random function of the graph (e.g., a uniform-at-random vertex), while accessing/downloading as few vertices of the graph as possible.

Motivated by practical applications, we study the complexities of a variety of problems in terms of the graph's mixing time $t_{\mathrm{mix}}$ and average degree $d_{\mathrm{avg}}$, two measures that are believed to be quite small in real-world social networks, and that have often been used in the applied literature to bound the performance of online exploration algorithms.

Our main result is that the algorithm has to access $\Omega\left(t_{\mathrm{mix}}\, d_{\mathrm{avg}}\, \epsilon^{-2} \ln \delta^{-1}\right)$ vertices to obtain, with probability at least $1 - \delta$, an $\epsilon$ additive approximation of the average of a bounded function on the vertices of a graph; this lower bound matches the performance of an algorithm that was proposed in the literature.

We also give tight bounds for the problem of returning a close-to-uniform-at-random vertex from the graph. Finally, we give lower bounds for the problems of estimating the average degree of the graph, and the number of vertices of the graph.
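The access model described in this abstract can be sketched as a neighbor oracle that records which vertices have been downloaded, together with a simple random walk that uses it. This is a minimal illustration under stated assumptions: the class `NeighborOracle`, the function `random_walk_average_degree`, and the cycle graph are hypothetical, the walk uses no burn-in and ignores the mixing time, and the harmonic-mean degree estimator is a standard technique swapped in for illustration rather than the algorithm the paper analyzes.

```python
import random


class NeighborOracle:
    """Wraps an adjacency dict and tracks which vertices were downloaded."""

    def __init__(self, adjacency):
        self._adjacency = adjacency
        self.downloaded = set()

    def neighbors(self, v):
        # One call corresponds to downloading the profile page of v.
        self.downloaded.add(v)
        return self._adjacency[v]


def random_walk_average_degree(oracle, seed_vertex, steps, rng):
    """Harmonic-mean estimate of the average degree along a simple random walk.

    Under the stationary distribution E[1/deg] = 1/d_avg, so
    steps / sum(1/deg) approaches d_avg once the walk has mixed.
    """
    v = seed_vertex
    inverse_degree_sum = 0.0
    for _ in range(steps):
        nbrs = oracle.neighbors(v)
        inverse_degree_sum += 1.0 / len(nbrs)
        v = rng.choice(nbrs)  # move to a uniformly random neighbor
    return steps / inverse_degree_sum


if __name__ == "__main__":
    # Hypothetical graph: a cycle on 10 vertices, so every degree is 2.
    cycle = {v: [(v - 1) % 10, (v + 1) % 10] for v in range(10)}
    oracle = NeighborOracle(cycle)
    estimate = random_walk_average_degree(oracle, 0, 50, random.Random(1))
    print(estimate, len(oracle.downloaded))
```

The size of `oracle.downloaded` is the cost measure the abstract refers to: revisiting a vertex adds a walk step but not a new download.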
