Abstract. Modularity makes it possible to estimate the quality of a partition of a graph into communities composed of highly inter-connected vertices. In this article, we introduce a complementary measure, based on inertia, and specially conceived to evaluate the quality of a partition based on real attributes describing the vertices. We also propose I-Louvain, a graph node clustering method which uses our criterion, combined with Newman's modularity, in order to detect communities in attributed graphs where real attribu…
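As a point of reference for the structural half of the criterion, the following is a minimal sketch of Newman's modularity for an undirected, unweighted graph, using the standard per-community form Q = Σ_c [L_c/m − (d_c/2m)²], where L_c is the number of edges inside community c, d_c is the sum of degrees in c, and m is the total number of edges. The function name `modularity` and the edge-list input format are illustrative assumptions. The inertia-based measure is not reproduced here because the abstract does not state its formula.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity of a partition of an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, each undirected edge listed once, no self-loops.
    community: dict mapping every vertex to a community label.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")

    internal = defaultdict(int)    # number of edges inside each community (L_c)
    degree_sum = defaultdict(int)  # sum of vertex degrees in each community (d_c)
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1

    # Q = sum over communities of L_c / m - (d_c / 2m)^2
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity(edges, two_groups))
```

Placing every vertex in a single community gives a modularity of 0 under this definition, which is why the measure rewards partitions only when they have more internal edges than expected from the degrees alone.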
“…These methods incorporate attribute information into an optimization objective like the modularity. [5] injects an attribute based similarity measure into the modularity function; [1] combines the gain in the modularity with multiple common users' attributes as an integrated objective; I-Louvain algorithm [3] proposes inertia-based modularity to describe the similarity between nodes with numeric attributes, and adds the inertia-based modularity to the original modularity formula to form the new optimization objective.…”
The majority of research on community detection in attributed networks follows an "early fusion" approach, in which the structural and attribute information about the network are integrated together to guide community detection. In this paper, we propose an approach called late fusion, which looks at this problem from a different perspective. We first exploit the network structure and node attributes separately to produce two different partitionings. We then combine these two sets of communities via a fusion algorithm, where we introduce a parameter for weighting the importance given to each type of information: node connections and attribute values. Extensive experiments on various real and synthetic networks show that our late-fusion approach can improve detection accuracy over using only network structure. Moreover, our approach runs significantly faster than other attributed community detection algorithms, including early-fusion ones.
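The abstract does not specify the fusion algorithm, so the following is only a hypothetical sketch of what a late-fusion step with a weighting parameter could look like. It is not the paper's method. Here two nodes are linked when a weighted agreement score between the structural partition and the attribute partition reaches a threshold, and the fused communities are the connected components of those links. The names `late_fusion`, `alpha`, and `threshold` are assumptions introduced for illustration.

```python
def late_fusion(structure_labels, attribute_labels, alpha=0.5, threshold=0.5):
    """Hypothetical fusion of two partitions (NOT the paper's algorithm).

    structure_labels[i]: community of node i found from the network structure alone.
    attribute_labels[i]: cluster of node i found from the node attributes alone.
    alpha: weight in [0, 1] given to structure; (1 - alpha) goes to attributes.
    Nodes i and j are linked when
        alpha * [same structural community] + (1 - alpha) * [same attribute cluster]
    is at least `threshold`. Fused communities are the connected components.
    """
    n = len(structure_labels)
    if len(attribute_labels) != n:
        raise ValueError("both partitions must label the same nodes")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    parent = list(range(n))  # union-find forest over the nodes

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            score = (alpha * (structure_labels[i] == structure_labels[j])
                     + (1.0 - alpha) * (attribute_labels[i] == attribute_labels[j]))
            if score >= threshold:
                parent[find(i)] = find(j)

    # Relabel components as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


structure = [0, 0, 0, 1, 1, 1]
attributes = [0, 0, 1, 1, 2, 2]
print(late_fusion(structure, attributes, alpha=1.0))                 # structure only
print(late_fusion(structure, attributes, alpha=0.0))                 # attributes only
print(late_fusion(structure, attributes, alpha=0.5, threshold=1.0))  # both must agree
```

In this sketch, `alpha=1.0` reproduces the structural partition and `alpha=0.0` reproduces the attribute partition, which is the behaviour one would expect from a parameter that weights the two sources of information. The pairwise loop is quadratic in the number of nodes and is kept only for clarity.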
“…Quality function based methods define a quantity of interest that an ideal partition would satisfy, while probabilistic methods identify communities through likelihood optimization and focus on the underlying statistical distribution for the observed network. A recent quality function-based method to handle multiple attributes is I-Louvain [15]. This method approaches the problem as an extension to the Louvain algorithm, which is the state-of-the-art scalable modularity quality function community detection method [16].…”
Section: A Related Work In Attributed Network (mentioning)
The stochastic block model (SBM) is a probabilistic model for community structure in networks. Typically, only the adjacency matrix is used to perform SBM parameter inference. In this paper, we consider circumstances in which nodes have an associated vector of continuous attributes that are also used to learn the node-to-community assignments and corresponding SBM parameters. While this assumption is not realistic for every application, our model assumes that the attributes associated with the nodes in a network's community can be described by a common multivariate Gaussian model. In this augmented, attributed SBM, the objective is to simultaneously learn the SBM connectivity probabilities and the multivariate Gaussian parameters describing each community. While there are recent examples in the literature that combine connectivity and attribute information to inform community detection, our model is the first augmented stochastic block model to handle multiple continuous attributes. This provides the flexibility in biological data to, for example, augment connectivity information with continuous measurements from multiple experimental modalities. Because the lack of labeled network data often makes community detection results difficult to validate, we highlight the usefulness of our model for two network prediction tasks: link prediction and collaborative filtering. After fitting this attributed stochastic block model, one can predict the attribute vector or the connectivity patterns of a new node given the complementary source of information (connectivity or attributes, respectively). We also highlight two biological examples where the attributed stochastic block model provides satisfactory performance in the link prediction and collaborative filtering tasks.
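To make the combined objective concrete, here is a minimal sketch of a joint log-likelihood for fixed community assignments: Bernoulli edges whose probability depends on the two endpoints' communities, plus a Gaussian density for each node's attribute vector. Two things here are assumptions rather than details from the abstract: edges and attributes are treated as conditionally independent given the assignments, and each community's covariance is taken to be diagonal to keep the example dependency-free (the abstract describes a general multivariate Gaussian). The function and parameter names are hypothetical.

```python
import math


def attributed_sbm_log_likelihood(adj, attrs, z, theta, means, variances):
    """Joint log-likelihood of edges and attributes for fixed assignments z.

    adj: symmetric 0/1 adjacency matrix (list of lists), no self-loops.
    attrs[i]: attribute vector of node i.
    z[i]: community index of node i.
    theta[r][s]: edge probability between communities r and s, strictly in (0, 1).
    means[r], variances[r]: per-dimension mean and variance for community r
        (diagonal covariance; a simplifying assumption of this sketch).
    """
    n = len(z)
    ll = 0.0

    # Connectivity term: one Bernoulli factor per unordered node pair.
    for i in range(n):
        for j in range(i + 1, n):
            p = theta[z[i]][z[j]]
            ll += math.log(p) if adj[i][j] else math.log(1.0 - p)

    # Attribute term: independent univariate Gaussians per dimension.
    for i in range(n):
        for x, mu, var in zip(attrs[i], means[z[i]], variances[z[i]]):
            ll += -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)

    return ll


# Four nodes: edges 0-1 and 2-3, with attributes clustered near 0 and near 5.
adj = [[0, 1, 0, 0],
       [1, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 1, 0]]
attrs = [[0.0], [0.1], [5.0], [5.1]]
theta = [[0.9, 0.1], [0.1, 0.9]]
means = [[0.0], [5.0]]
variances = [[1.0], [1.0]]

matching = attributed_sbm_log_likelihood(adj, attrs, [0, 0, 1, 1], theta, means, variances)
mismatched = attributed_sbm_log_likelihood(adj, attrs, [0, 1, 0, 1], theta, means, variances)
print(matching > mismatched)
```

An inference procedure would search over `z` and the parameters to maximize a quantity of this kind. The sketch only evaluates it, which is enough to see that assignments agreeing with both the edges and the attributes score higher than assignments that contradict them.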
“…Various definitions of hybrid objective functions and efficient ways to find optimal solutions have been proposed. In most cases the result is a set of non-overlapping communities (Baroni et al 2017; Sánchez et al 2015; Combe et al 2015). The overlapping case has been addressed by soft clustering schemes (Xu et al 2012), by hard clustering of the edge set (Galbrun et al 2014) or by building generative models in such a way that a node may freely belong to several communities (Yang et al 2013).…”
Applying closed pattern mining to attributed two-mode networks requires two conditions. First, as in two-mode networks there are two kinds of vertices, each described with a proper attribute set, we have to consider patterns made of two components that we call bi-patterns. The occurrences of a bi-pattern form an extension made of a pair of vertex subsets. Second, Formal Concept Analysis and Closed Pattern Mining were recently applied to networks by reducing the extensions of patterns to their cores, according to some core definition. We need to consider appropriate core definitions for two-mode networks and define closed bi-patterns accordingly. We describe in this article a general framework to define closed bi-pattern mining. We also show that this methodology applies as well to cores of directed and undirected networks in which each vertex subset is associated with a specific role. We illustrate the methodology first on a two-mode network of epistemological data, then on a directed advice network of lawyers and finally on an undirected bibliographical network.