Anomaly Detection in the Dynamics of Web and Social Networks Using Associative Memory

Miz, Volodymyr; Ricaud, Benjamin; Benzi, Kirell

doi:10.1145/3308558.3313541

Cited by 17 publications (15 citation statements)

References 33 publications (37 reference statements)

“…The structural patterns are captured through connectivity [4], sketch [26], dense subgraphs [5], [27], [28], and so on. Some works design approximation mechanisms of the structural patterns to achieve near real-time pattern updating [5], [6], [29].…”

Section: B Anomaly Detection In Dynamic Network (mentioning)

confidence: 99%

H-VGRAE: A Hierarchical Stochastic Spatial-Temporal Embedding Method for Robust Anomaly Detection in Dynamic Networks

Yang, Zhou, Wen et al. 2020

Preprint

Detecting anomalous edges and nodes in dynamic networks is critical in various areas, such as social media and computer networks. Recent approaches leverage network embedding techniques to learn how to generate node representations for normal training samples and detect anomalies that deviate from normal patterns. However, most existing network embedding approaches learn deterministic node representations, which are sensitive to fluctuations of the topology and attributes due to the high flexibility and stochasticity of dynamic networks. In this paper, a stochastic neural network, named Hierarchical Variational Graph Recurrent Autoencoder (H-VGRAE), is proposed to detect anomalies in dynamic networks using learned robust node representations in the form of random variables. H-VGRAE is a semi-supervised model that captures normal patterns in the training set by maximizing the likelihood of the adjacency matrix and node attributes via variational inference. The encoder of H-VGRAE encodes hierarchical spatial-temporal information of topology and node attributes into multi-layer conditional random variables, and the decoder then reconstructs the dynamic network from the latent random variables. For a new observation of the dynamic network, the reconstruction probabilities of edges and node attributes can be obtained from the trained H-VGRAE, and those with low reconstruction probabilities are declared anomalous. Compared with existing methods, H-VGRAE has three main advantages: 1) H-VGRAE learns robust node representations through stochasticity modeling and the extraction of multi-scale spatial-temporal features; 2) H-VGRAE can be extended to a deeper structure as the scale of the dynamic network increases; 3) anomalous edges and nodes can be located and interpreted from a probabilistic perspective. Extensive experiments on four real-world datasets demonstrate that H-VGRAE outperforms state-of-the-art competitors on anomaly detection in dynamic networks.
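
The final decision step described in this abstract (declare low-probability reconstructions anomalous) can be illustrated with a minimal sketch. It does not implement H-VGRAE itself: the function name `flag_anomalous_edges`, the `threshold` value, and the hand-written probabilities are all assumptions made for illustration, standing in for the output of a trained model.

```python
# Minimal sketch of thresholding reconstruction probabilities.
# NOTE: the probabilities below are invented toy values, not model output,
# and the 0.05 threshold is an arbitrary assumption.

def flag_anomalous_edges(edge_probs, threshold=0.05):
    """Return, in sorted order, the edges whose reconstruction
    probability falls below `threshold`."""
    return sorted(edge for edge, p in edge_probs.items() if p < threshold)


# Hypothetical reconstruction probabilities for three observed edges.
edge_probs = {
    ("a", "b"): 0.91,
    ("a", "c"): 0.02,  # poorly reconstructed, so it is flagged
    ("b", "c"): 0.40,
}

anomalies = flag_anomalous_edges(edge_probs)
```

In practice the threshold would have to be tuned on validation data; nothing in the abstract specifies how it is chosen.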

“…It was shown that real-world events can be detected and tracked using Wikipedia viewership data. Besides, it is possible to use this data to detect abnormal patterns of visits in groups of connected Wikipedia articles [8]. For example, popular sports events such as Super Bowl, NBA playoffs, and FIFA World Cup, can be detected just by looking at Wikipedia viewership dynamics and the hyperlinks structure.…”

Section: Combining the Hyperlink Graph And Time Series Of Visits (mentioning)

confidence: 99%

“…Fig. 3 illustrates the result of the anomaly detection algorithm presented in [8]. Using the combination of the hyperlink network and the visitors' activity, the authors detected the groups of articles with simultaneous spikes in viewership dynamics.…”

Section: Combining the Hyperlink Graph And Time Series Of Visits (mentioning)

confidence: 99%
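
The idea in the statement above, combining the hyperlink network with visitor activity to find groups of articles that spike together, can be sketched as follows. This is a simplified stand-in, not the algorithm of [8]: it flags spikes with a mean-plus-k-standard-deviations rule, keeps only hyperlinks whose endpoints spike at a shared time step, and reports the connected components. The function names, the parameter `k`, and the toy view counts are all assumptions.

```python
from statistics import mean, pstdev


def spike_times(series, k=2.0):
    """Indices where the series exceeds mean + k * standard deviation."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return set()  # a flat series has no spikes
    return {t for t, v in enumerate(series) if v > mu + k * sigma}


def co_spiking_groups(views, links, k=2.0):
    """Group hyperlinked articles that spike at a shared time step."""
    spikes = {article: spike_times(s, k) for article, s in views.items()}

    # Keep only hyperlinks whose two endpoints share at least one spike time.
    adj = {article: set() for article in views}
    for u, v in links:
        if spikes[u] & spikes[v]:
            adj[u].add(v)
            adj[v].add(u)

    # Connected components of the filtered graph (isolated nodes skipped).
    seen, groups = set(), []
    for start in adj:
        if start in seen or not adj[start]:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adj[node] - component)
        seen |= component
        groups.append(sorted(component))
    return groups


# Toy, invented view counts (10 time steps each); not real Wikipedia data.
views = {
    "Super_Bowl": [10, 10, 10, 10, 10, 10, 10, 100, 10, 10],
    "NFL": [5, 5, 5, 5, 5, 5, 5, 50, 5, 5],
    "FIFA_World_Cup": [10, 10, 100, 10, 10, 10, 10, 10, 10, 10],
    "Main_Page": [7] * 10,
}
links = [("Super_Bowl", "NFL"), ("NFL", "FIFA_World_Cup"), ("Super_Bowl", "Main_Page")]

groups = co_spiking_groups(views, links)
```

Here only the first two articles spike at the same time step and are linked, so they form the single reported group; the other two links are dropped because their endpoints never spike together.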

“…The emerging field of spatio-temporal data mining [2] highlighted an increasing interest and a need for reproducible network datasets that contain dynamically changing components. Miz et al [8] adopted an anomaly detection approach on graphs to analyze the visitors' activity in relation to real-world events.…”

Section: Introduction (mentioning)

confidence: 99%

“…Network view of a reduced subset of Wikipedia web pages (∼20K nodes, ∼100K edges) using the method described in [8]. Nodes correspond to popular articles with spikes of visits during the period Oct. 2014 - Apr.…”

mentioning

confidence: 99%

A Graph-Structured Dataset for Wikipedia Research

Aspert, Miz, Ricaud 2019

Companion Proceedings of the 2019 World Wide Web Conference

Self Cite

Wikipedia is a rich and invaluable source of information. Its central place on the Web makes it a particularly interesting object of study for scientists. Researchers from different domains have used various complex datasets related to Wikipedia to study language, social behavior, knowledge organization, and network theory. While it is a scientific treasure, the large size of the dataset hinders preprocessing and may be a challenging obstacle for potential new studies. This issue is particularly acute in scientific domains where researchers may lack technical and data-processing expertise. On one hand, the size of Wikipedia dumps is large, which makes the parsing and extraction of relevant information cumbersome. On the other hand, the API is straightforward to use but restricted to a relatively small number of requests. The middle ground is at the mesoscopic scale, when researchers need a subset of Wikipedia ranging from thousands to hundreds of thousands of pages, but there exists no efficient solution at this scale. In this work, we propose an efficient data structure to make requests and access subnetworks of Wikipedia pages and categories. We provide convenient tools for accessing and filtering viewership statistics or "pagecounts" of Wikipedia web pages. The dataset organization leverages principles of graph databases, which allow rapid and intuitive access to subgraphs of Wikipedia articles and categories. The dataset and deployment guidelines are available on the LTS2 website https://lts2.epfl.ch/Datasets/Wikipedia/.
