Finding communities in complex networks is a challenging task and one promising approach is the Stochastic Block Model (SBM). But the influences from various fields led to a diversity of variants and inference methods. Therefore, a comparison of the existing techniques and an independent analysis of their capabilities and weaknesses is needed. As a first step, we review the development of different SBM variants such as the degree-corrected SBM of Karrer and Newman or Peixoto’s hierarchical SBM. Beside stating all these variants in a uniform notation, we show the reasons for their development. Knowing the variants, we discuss a variety of approaches to infer the optimal partition like the Metropolis-Hastings algorithm. We perform our analysis based on our extension of the Girvan-Newman test and the Lancichinetti-Fortunato-Radicchi benchmark as well as a selection of some real world networks. Using these results, we give some guidance to the challenging task of selecting an inference method and SBM variant. In addition, we give a simple heuristic to determine the number of steps for the Metropolis-Hastings algorithms that lack a usual stop criterion. With our comparison, we hope to guide researches in the field of SBM and highlight the problem of existing techniques to focus future research. Finally, by making our code freely available, we want to promote a faster development, integration and exchange of new ideas.
With the ever-increasing popularity and applications of graph neural networks, several proposals have been made to interpret and understand the decisions of a GNN model. Explanations for a GNN model differ in principle from other input settings. It is important to attribute the decision to input features and other related instances connected by the graph structure. We find that the previous explanation generation approaches that maximize the mutual information between the label distribution produced by the GNN model and the explanation to be restrictive. Specifically, existing approaches do not enforce explanations to be predictive, sparse, or robust to input perturbations.In this paper, we lay down some of the fundamental principles that an explanation method for GNNs should follow and introduce a metric fidelity as a measure of the explanation's effectiveness. We propose a novel approach Zorro based on the principles from rate-distortion theory that uses a simple combinatorial procedure to optimize for fidelity. Extensive experiments on real and synthetic datasets reveal that Zorro produces sparser, stable, and more faithful explanations than existing GNN explanation approaches.
A qualitative analysis of the Hodgkin-Huxley model (Hodgkin and Huxley 1952), which closely mimics the ionic processes at a real nerve membrane, is performed by means of a singular perturbation theory. This was achieved by introducing a perturbation parameter that, if decreased, "speeds up" the fast variables of the Hodgkin-Huxley equations (membrane potential and sodium activation), whereas it does not affect the slow variables (sodium inactivation and potassium activation). In the most extreme case, if the perturbation parameter is set to zero, the original four-dimensional system "degenerates" to a system with only two differential equations. This degenerate system is easier to analyze and much more intuitive than the original Hodgkin-Huxley equations. It shows, like the original model, an infinite train of action potentials if stimulated by an input current in a suitable range. Additionally, explanations for the increased sensitivity to depolarizing current steps that precedes an action potential can be found by analysis of the degenerate system. Using the theory of Mishchenko and Rozov (1980) it is shown that the degenerate system does not only represent a simplification of the original Hodgkin-Huxley equations but also gives a valid approximation of the original model at least for stimulating currents that are constant within a suitable range.
Privacy and interpretability are two of the important ingredients for achieving trustworthy machine learning. We study the interplay of these two aspects in graph machine learning through graph reconstruction attacks. The goal of the adversary here is to reconstruct the graph structure of the training data given access to model explanations. Based on the different kinds of auxiliary information available to the adversary, we propose several graph reconstruction attacks. We show that additional knowledge of post-hoc feature explanations substantially increases the success rate of these attacks. Further, we investigate in detail the differences between attack performance with respect to three different classes of explanation methods for graph neural networks: gradient-based, perturbationbased, and surrogate model-based methods. While gradient-based explanations reveal the most in terms of the graph structure, we find that these explanations do not always score high in utility. For the other two classes of explanations, privacy leakage increases with an increase in explanation utility. Finally, we propose a defense based on a randomized response mechanism for releasing the explanations which substantially reduces the attack success rate. Our anonymized code is available at xxxxxxxxx.
Graph neural networks (GNNs) have achieved great success on various tasks and fields that require relational modeling. GNNs aggregate node features using the graph structure as inductive biases resulting in flexible and powerful models. However, GNNs remain hard to interpret as the interplay between node features and graph structure is only implicitly learned. In this paper, we propose a novel method called KEdge for explicitly sparsifying the underlying graph by removing unnecessary neighbors. Our key idea is based on a tractable method for sparsification using the Hard Kumaraswamy distribution that can be used in conjugation with any GNN model. KEdge learns edge masks in a modular fashion trained with any GNN allowing for gradient-based optimization in an end-to-end fashion. We demonstrate through extensive experiments that our model KEdge can prune a large proportion of the edges with only a minor effect on the test accuracy. Specifically, in the PubMed dataset, KEdge learns to drop more than 80% of the edges with an accuracy drop of merely ≈ 2% showing that graph structure has only a small contribution in comparison to node features. Finally, we also show that KEdge effectively counters the over-smoothing phenomena in deep GNNs by maintaining good task performance with increasing GNN layers.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.