2017
DOI: 10.48550/arxiv.1702.03959
Preprint

How large is your graph?

Abstract: We consider the problem of estimating the graph size, where one is given only local access to the graph. We formally define a query model in which one starts with a seed node and is allowed to make queries about neighbours of nodes that have already been seen. In the case of undirected graphs, an estimator of Katzir et al. (2014) based on a sample from the stationary distribution $\pi$ uses $O(1/\lVert\pi\rVert_2\,\ldots)$ …

Cited by 2 publications (4 citation statements)
References 14 publications

“…Katzir et al. [13] showed, through a variant of collision counting, that taking $k = O(\sqrt{n} + m/n)$ suffices to estimate $n$ (if one is willing to use more information about the graph, the bound may be improved to $k = O(\lVert\pi\rVert_2^{-1} + m/n)$, where $\lVert\pi\rVert_2$ is the Euclidean norm of the stationary distribution $\pi$). Kanade et al. [12] established a corresponding lower bound for $k$ in this setting. This yields a time complexity of $t_{\mathrm{unif}}(\sqrt{n} + m/n)$.…”
Section: /4 Rel (mentioning)
confidence: 99%
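
The collision-counting idea quoted above can be illustrated with a small sketch. It is not taken from the cited papers' text; it assumes the commonly described form of the estimator, in which the sum of degree ratios over ordered pairs of distinct sample indices is divided by the number of ordered colliding pairs. The names `sample_stationary` and `estimate_graph_size` are hypothetical, and the sampler cheats by reading all degrees directly, whereas a real crawler would obtain its samples from a random walk.

```python
import random
from collections import Counter


def sample_stationary(degree, k, rng):
    """Draw k nodes with probability proportional to degree.

    Idealised stand-in (assumption) for running a random walk until it mixes;
    it needs global access to the degrees, which a real crawler lacks.
    """
    nodes = list(degree)
    weights = [degree[v] for v in nodes]
    return rng.choices(nodes, weights=weights, k=k)


def estimate_graph_size(samples, degree):
    """Collision-based estimate of the number of nodes n.

    numerator   = sum over ordered pairs i != j of deg(X_i) / deg(X_j)
    denominator = number of ordered pairs i != j with X_i == X_j
    Returns None when the sample contains no collision.
    """
    k = len(samples)
    degs = [degree[v] for v in samples]
    # Sum over all pairs (i, j), minus the k diagonal terms where i == j.
    ratio_sum = sum(degs) * sum(1.0 / d for d in degs) - k
    # A node seen c times contributes c * (c - 1) ordered colliding pairs.
    collisions = sum(c * (c - 1) for c in Counter(samples).values())
    if collisions == 0:
        return None
    return ratio_sum / collisions
```

With too few samples no collision occurs and the function returns `None`, which matches the intuition behind the quoted sample bounds: the sample must be large enough for repeats to appear.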
“…This yields a time complexity of $t_{\mathrm{unif}}(\sqrt{n} + m/n)$. Kanade et al. [12] asked whether the factor $t_{\mathrm{unif}}$ in those bounds was really necessary or whether more efficient estimators could be designed. Indeed, in those methods, each unit of information already costs $t_{\mathrm{unif}}$ steps.…”
Section: /4 Rel (mentioning)
confidence: 99%
“…) samples to obtain an $\epsilon$-approximation with probability at least $1 - \delta$. It has been shown that the number of samples in Katzir's algorithm is necessary ([12]). The Katzir et al. algorithm implies an upper bound of $t_{\mathrm{mix}}$ max{ 1…”
Section: Number Of Downloads To Estimate the Number Of Vertices (mentioning)
confidence: 99%
“…In data mining, one often seeks algorithms that can return (approximate) properties of online social networks, so to study and analyze them, but without having to download the millions, or billions, of vertices that they are made up of. The properties of interest range from the order of the graph [11,12], to its average degree (or its degree distribution) [8-10], to the average clustering coefficient [22,24] or triangle counting [16], to non-topological properties such as the average score that the social network's users assign to a movie or a song, or to the fraction of people that like a specific article or page. All these problems have trivial solutions when the graph (with its non-topological attributes) is stored in main memory, or in the disk: choosing a few independent and uniform at random vertices from the graph, and computing their contribution to the (additive) property of interest, is sufficient to estimate the (unknown) value of the graph property; the empirical average of the contributions of the randomly chosen vertices will be close to the right value with high probability, by the central limit theorem.…”
Section: Introduction (mentioning)
confidence: 99%
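
The "trivial solution" described in the last statement, averaging the contributions of uniformly sampled vertices when the whole graph is stored locally, might look like the following minimal sketch. The function name `estimate_average` and its parameters are hypothetical and chosen only for illustration; `contribution` stands for any per-vertex quantity such as a degree or a user rating.

```python
import random


def estimate_average(nodes, contribution, k, rng):
    """Empirical mean of contribution(v) over k vertices drawn
    independently and uniformly at random (with replacement).

    Assumes the full vertex list is available, i.e. the graph is stored
    locally, which is exactly what the local-access setting rules out.
    """
    sample = [rng.choice(nodes) for _ in range(k)]
    return sum(contribution(v) for v in sample) / k
```

For example, passing `lambda v: len(adjacency[v])` as `contribution`, with `adjacency` a hypothetical adjacency-list dictionary, would estimate the average degree.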