2007
DOI: 10.1073/pnas.0610537104

Mixture models and exploratory analysis in networks

Abstract: Networks are widely used in the biological, physical, and social sciences as a concise mathematical representation of the topology of systems of interacting components. Understanding the structure of these networks is one of the outstanding challenges in the study of complex systems. Here we describe a general technique for detecting structural features in large-scale network data that works by dividing the nodes of a network into classes such that the members of each class have similar patterns of connection …

Cited by 477 publications (436 citation statements)
References 28 publications
“…On the other hand, Guimerà, Sales-Pardo, and Amaral [85] and Fortunato and Barthélemy [73] showed that random graphs have high-modularity subsets and that there exists a size scale below which communities cannot be identified. In part as a response to this, some recent work has had a more statistical flavor [86,140,144,94,133]. In light of our results, this work seems promising, both due to potential "overfitting" issues arising from the extreme sparsity of the networks, and also due to the empirically-promising regularization properties exhibited by local spectral methods.…”
Section: Relationship With Community Identification Methods (mentioning)
confidence: 77%
“…Runtime for the main loop in MATLAB on a 2 GHz laptop is ~6 min for N = 10^6 nodes with average degree 16 and K = 4.] Furthermore, we note that previous methods in which parameter inference is performed by optimizing a likelihood function via expectation maximization (EM) [11,18] are also special cases of the framework presented here. EM is a limiting case of VB in which one collapses the distributions over parameters to point estimates at the mode of each distribution; however EM is prone to over-fitting and cannot be used to determine the appropriate number of modules, as the likelihood of observed data increases with the number of modules in the model.…”
Section: Main Loop (mentioning)
confidence: 99%
“…Update the variational distribution over parameters from the expected counts and pseudocounts [Eqs. (11), (12), (15)], where C = N(N − 1)/2, M = Σ_{i>j} A_{ij}, and u⃗ is a N-by-1 vector of 1's;…”
Section: Main Loop (mentioning)
confidence: 99%
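
The two constants defined in the statement above can be computed directly from an adjacency matrix. The following is a minimal sketch, assuming a simple undirected graph stored as a symmetric 0/1 matrix with a zero diagonal; the function name `pair_and_edge_counts` and the 4-node example graph are hypothetical, and the quoted update equations themselves are not reproduced here. The final check, that M equals half of u⃗ᵀAu⃗, is a standard identity for such matrices and is shown only to illustrate the role a vector of 1's can play, not as the cited paper's own formula.

```python
def pair_and_edge_counts(adjacency):
    """Return (C, M) for a simple undirected graph.

    `adjacency` is assumed to be a symmetric 0/1 matrix (list of lists)
    with zeros on the diagonal.
    """
    n = len(adjacency)
    # C = N(N - 1)/2: the number of distinct unordered node pairs.
    c = n * (n - 1) // 2
    # M = sum over i > j of A_ij: each undirected edge is counted once.
    m = sum(adjacency[i][j] for i in range(n) for j in range(i))
    return c, m


# Hypothetical example: a triangle on nodes 0, 1, 2 plus the edge 2-3.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
C, M = pair_and_edge_counts(A)

# Illustration only: with u an N-by-1 vector of 1's, u^T A u sums every
# entry of A, which counts each edge twice for a symmetric matrix.
u = [1] * len(A)
M_alt = sum(u[i] * A[i][j] * u[j] for i in range(len(A)) for j in range(len(A))) // 2

print(C, M, M_alt)
```

Summing only over i > j avoids double counting without requiring a division, which is why the statement defines M over the lower triangle.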
“…Along with the rapid development of network clustering techniques, the ability of revealing overlaps between communities has become very important as well [86,9,39,83,31,89,57,71,52]. Indeed, communities in real-world graphs are often inherently overlapping: each person in a social web belongs usually to several groups (family, colleagues, friends, etc.…”
Section: Applications: Community Finding and Clustering (mentioning)
confidence: 99%