2008
DOI: 10.1103/physreve.78.046110

Benchmark graphs for testing community detection algorithms

Abstract: Community structure is one of the most important features of real networks and reveals the internal organization of the nodes. Many algorithms have been proposed but the crucial issue of testing, i.e., the question of how good an algorithm is, with respect to others, is still open. Standard tests include the analysis of simple artificial graphs with a built-in community structure, that the algorithm has to recover. However, the special graphs adopted in actual tests have a structure that does not reflect the r…

Cited by 2,572 publications (1,986 citation statements)
References: 22 publications
“…In the second approach, algorithms are tested against computer-generated networks that have some form of community structure artificially embedded within them. A number of standard benchmark networks have been proposed for this purpose, such as the 'four groups' networks [14] or the so-called LFR benchmark networks [32]. A number of studies have been published that compare the performance of proposed algorithms in these benchmark tests [33,34].…”
mentioning
confidence: 99%
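A benchmark test of the kind described in this statement needs a way to score how well a recovered partition matches the embedded one. The sketch below uses normalized mutual information, which is a common choice but an assumption here, since the quoted passage does not name a measure. The function name and the list-of-labels input format are illustrative.

```python
from collections import Counter
from math import log


def normalized_mutual_information(labels_a, labels_b):
    """Score agreement between two partitions given as per-node label lists.

    Returns 2 * I(A; B) / (H(A) + H(B)), which is 1.0 when the partitions
    are identical up to relabeling and 0.0 when they are independent.
    """
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("label lists must be non-empty and of equal length")

    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    # Shannon entropies of the two partitions.
    h_a = -sum(c / n * log(c / n) for c in count_a.values())
    h_b = -sum(c / n * log(c / n) for c in count_b.values())

    # Mutual information from the joint label counts.
    mi = sum(
        c / n * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )

    if h_a + h_b == 0:
        # Both partitions put every node in a single group.
        return 1.0
    return 2 * mi / (h_a + h_b)


planted = [0, 0, 0, 1, 1, 1]
recovered = [1, 1, 1, 0, 0, 0]  # same grouping, different label names
print(normalized_mutual_information(planted, recovered))
```

Because the score depends only on which nodes are grouped together, the label names an algorithm happens to assign do not affect it.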
“…In order to get a grasp on the magnitude of this challenge, we conduct a controlled analysis of topic-model algorithms for highly specified sets of synthetic data. This high degree of control allows us to tease apart the theoretical limitations of the algorithms from other sources of error that would remain uncontrolled with real-world data sets [26][27][28]. Our analyses reveal that standard techniques for likelihood optimization are significantly hindered by the roughness of the likelihood-function landscape, even for very simple cases.…”
Section: Introduction
mentioning
confidence: 92%
“…Fortunately, recent work has shown that it is possible to derive this information from the spectra of the non-backtracking matrix [3] and the flow matrix [4], at least on the classic version of the planted partition model [5], where clusters have identical size and nodes the same degree (on average). We show that the prediction of the number of clusters remains accurate as well on the LFR benchmark graph [6], which extends the original planted partition model by introducing realistic features of community structure, i.e. heterogeneous distributions of degrees and cluster sizes.…”
Section: Introduction
mentioning
confidence: 98%
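The classic planted partition model mentioned in this statement (equal-sized clusters, same expected degree for every node) can be sketched in a few lines. This is a minimal illustration under assumed names: `planted_partition`, `p_in`, `p_out`, and `mixing_fraction` are hypothetical and do not come from the cited papers. It does not implement the LFR benchmark, which additionally draws degrees and cluster sizes from heterogeneous distributions.

```python
import random
from itertools import combinations


def planted_partition(num_groups, group_size, p_in, p_out, seed=None):
    """Generate a planted partition graph.

    Nodes in the same group are linked with probability p_in, nodes in
    different groups with probability p_out. Returns (edges, membership),
    where membership[node] is the index of the node's group.
    """
    rng = random.Random(seed)
    n = num_groups * group_size
    membership = [node // group_size for node in range(n)]
    edges = []
    for u, v in combinations(range(n), 2):
        p = p_in if membership[u] == membership[v] else p_out
        if rng.random() < p:
            edges.append((u, v))
    return edges, membership


def mixing_fraction(edges, membership):
    """Fraction of edges that join nodes in different groups."""
    if not edges:
        return 0.0
    external = sum(1 for u, v in edges if membership[u] != membership[v])
    return external / len(edges)


# Four groups of 32 nodes, with illustrative link probabilities.
edges, membership = planted_partition(4, 32, p_in=0.3, p_out=0.02, seed=1)
print(len(edges), round(mixing_fraction(edges, membership), 3))
```

Raising `p_out` relative to `p_in` blurs the groups, which is the usual way such a benchmark is made harder for a detection algorithm.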