In this work, we study the correlation between attribute sets and the occurrence of dense subgraphs in large attributed graphs, a task we call structural correlation pattern mining. A structural correlation pattern is a dense subgraph induced by a particular attribute set. Existing methods are not able to extract relevant knowledge regarding how vertex attributes interact with dense subgraphs. Structural correlation pattern mining combines aspects of frequent itemset and quasi-clique mining problems. We propose statistical significance measures that compare the structural correlation of attribute sets against their expected values using null models. Moreover, we evaluate the interestingness of structural correlation patterns in terms of size and density. An efficient algorithm that combines search and pruning strategies in the identification of the most relevant structural correlation patterns is presented. We apply our method for the analysis of three real-world attributed graphs: a collaboration, a music, and a citation network, verifying that it provides valuable knowledge in a feasible time.
Abstract. How are personal interests related to the communities in large social networks? In order to answer this kind of question, we introduce structural correlation pattern mining, which is the identification of interesting associations between vertex attributes and dense subgraphs. We present a model and algorithms that explore search, pruning, sampling and parallelization strategies to solve this problem for large graphs. Results show that structural correlation pattern mining enables the discovery of relevant patterns in real-life datasets.
Modeling information that resides on vertices of large graphs is a key problem in several real-life applications, ranging from social networks to the Internet-of-things. Signal Processing on Graphs and, in particular, graph wavelets can exploit the intrinsic smoothness of these datasets in order to represent them in a both compact and accurate manner. However, how to discover wavelet bases that capture the geometry of the data with respect to the signal as well as the graph structure remains an open question. In this paper, we study the problem of computing graph wavelet bases via sparse cuts in order to produce low-dimensional encodings of data-driven bases. This problem is connected to known hard problems in graph theory (e.g. multiway cuts) and thus requires an efficient heuristic. We formulate the basis discovery task as a relaxation of a vector optimization problem, which leads to an elegant solution as a regularized eigenvalue computation. Moreover, we propose several strategies in order to scale our algorithm to large graphs. Experimental results show that the proposed algorithm can effectively encode both the graph structure and signal, producing compressed and accurate representations for vertex values in a wide range of datasets (e.g. sensor and gene networks) and significantly outperforming the best baseline. * This is a long version of a paper with the same title published at SIGKDD'16 ACM ISBN 978-1-4503-2138-9.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.