2004
DOI: 10.1073/pnas.0307750100

Tracking evolving communities in large linked networks

Abstract: We are interested in tracking changes in large-scale data by periodically creating an agglomerative clustering and examining the evolution of clusters (communities) over time. We examine a large real-world data set: the NEC CiteSeer database, a linked network of >250,000 papers. Tracking changes over time requires a clustering algorithm that produces clusters stable under small perturbations of the input data. However, small perturbations of the CiteSeer data lead to significant changes to most of the clusters…

Cited by 237 publications (160 citation statements)
References 13 publications
“…The number of families present in the spoligotype data and the probability distribution for each of them were estimated using the Monte Carlo cross-validation (MCCV) technique, which was developed to extract as much information from the data as possible, without any prior knowledge (Smyth, 1996). We used the stability, or average best match (Hopcroft et al, 2004), and the log-likelihood to choose a final mixture model. The results were compared to the families that have been identified using the prototypes extracted from the SpolDB3 database (Filliol et al, 2002).…”
Section: Methods (mentioning)
confidence: 99%
“…We chose a final mixture model based on the total stability (Hopcroft et al, 2004) and the total loglikelihood. We call the stability of a set, or putative family, of spoligotypes the average best match between this set and the sets identified using other models.…”
Section: Model Initialization and Validation (mentioning)
confidence: 99%
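The "average best match" notion of stability quoted above can be illustrated with a minimal sketch. It assumes Jaccard similarity as the match score, which is a stand-in chosen for illustration and not necessarily the measure used in the cited papers; the names `jaccard` and `stability` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (assumed match score, for illustration only)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def stability(candidate, other_models):
    """Average, over the other models, of the candidate set's best match.

    `other_models` is a list of clusterings; each clustering is a list of sets.
    """
    best_matches = [
        max((jaccard(candidate, s) for s in model), default=0.0)
        for model in other_models
    ]
    return sum(best_matches) / len(best_matches) if best_matches else 0.0


# Hypothetical toy data: one candidate set and two alternative clusterings.
candidate = {1, 2, 3}
models = [
    [{1, 2, 3}, {4}],    # contains an exact match for the candidate
    [{1, 2}, {3, 4}],    # best match overlaps on two of three items
]
score = stability(candidate, models)
```

A set that reappears almost unchanged in every alternative model scores close to 1, while a set that is split or merged differently in each model scores lower.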
“…Its basic idea is to apply a static clustering method to each snapshot independently and capture the evolution of the communities by comparing the clustering of the consecutive snapshots. Based on this approach, Hopcroft et al [30] were of the first authors who applied a static clustering on each snapshots and they tracked communities over time by the clusters similarities over consecutive timesteps. They showed that even small perturbations in the graph could lead significant changes in the structure of the detected communities.…”
Section: B Community Detection In Dynamic Network (mentioning)
confidence: 99%
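The snapshot-based approach described in this statement (cluster each snapshot independently, then compare consecutive clusterings) can be sketched as follows. This is a minimal illustration under stated assumptions: clusters are given as Python sets, similarity is Jaccard overlap, and the `threshold` value and the function name `match_snapshots` are hypothetical choices, not taken from the cited work.

```python
def match_snapshots(prev, curr, threshold=0.5):
    """Link each cluster in `prev` to its most similar cluster in `curr`.

    Returns a dict mapping an index in `prev` to an index in `curr`,
    or to None when no cluster reaches the (assumed) similarity threshold.
    """
    links = {}
    for i, p in enumerate(prev):
        best_j, best_score = None, 0.0
        for j, c in enumerate(curr):
            union = len(p | c)
            score = len(p & c) / union if union else 0.0
            if score > best_score:
                best_j, best_score = j, score
        links[i] = best_j if best_score >= threshold else None
    return links


# Hypothetical clusterings of two consecutive snapshots.
snapshot_t = [{1, 2, 3}, {4, 5}, {9, 10}]
snapshot_t1 = [{4, 5, 6}, {1, 2, 3, 7}, {8}]
links = match_snapshots(snapshot_t, snapshot_t1)
```

A cluster mapped to None has no sufficiently similar successor, which in this toy setting would be read as the community dissolving between the two timesteps.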
“…Apart from that, there have been attempts to track communities over time and interpret their evolution, using static snapshots of the network, e.g. [9,10], besides an array of case studies. In [11] a parameter-based dynamic graph clustering method is proposed which allows user exploration.…”
Section: Introduction (mentioning)
confidence: 99%