This paper shows the feasibility of utilizing the Kernel Spectral Clustering (KSC) method for the purpose of community detection in big data networks. KSC employs a primal-dual framework to construct a model. It results in a powerful property of effectively inferring the community affiliation for out-of-sample extensions. The original large kernel matrix cannot fitinto memory. Therefore, we select a smaller subgraph that preserves the overall community structure to construct the model. It makes use of the out-of-sample extension property for community membership of the unseen nodes. We provide a novel memory-and computationally efficient model selection procedure based on angular similarity in the eigenspace. We demonstrate the effectiveness of KSC on large scale synthetic networks and real world networks like the YouTube network, a road network of California and the Livejournal network. These networks contain millions of nodes and several million edges.
Structural health monitoring refers to the process of measuring damagesensitive variables to assess the functionality of a structure. In principle, vibration data can capture the dynamics of the structure and reveal possible failures, but environmental and operational variability can mask this information. Thus, an effective outlier detection algorithm can be applied only after having performed data normalization (i.e. filtering) to eliminate external influences. Instead, in this article we propose a technique which unifies the data normalization and damage detection steps. The proposed algorithm, called adaptive kernel spectral clustering (AKSC), is initialized and calibrated in a phase when the structure is undamaged. The calibration process is crucial to ensure detection of early damage and minimize the number of false alarms. After the calibration, the method can automatically identify new regimes which may be associated with possible faults. These regimes are discovered by means of two complementary damage (i.e. outlier) indicators. The proposed strategy is validated with a simulated example and with real-life natural frequency data from the Z24 pre-stressed concrete bridge, which was progressively damaged at the end of a one-year monitoring period.
This paper proposes a multiclass semisupervised learning algorithm by using kernel spectral clustering (KSC) as a core model. A regularized KSC is formulated to estimate the class memberships of data points in a semisupervised setting using the one-versus-all strategy while both labeled and unlabeled data points are present in the learning process. The propagation of the labels to a large amount of unlabeled data points is achieved by adding the regularization terms to the cost function of the KSC formulation. In other words, imposing the regularization term enforces certain desired memberships. The model is then obtained by solving a linear system in the dual. Furthermore, the optimal embedding dimension is designed for semisupervised clustering. This plays a key role when one deals with a large number of clusters.
This paper is related to community detection in complex networks. We show the use of kernel spectral clustering for the analysis of unweighted networks. We employ the primaldual framework and make use of out-of-sample extension. In the latter the assignement rule for the new nodes is based on a model learned in the training phase. We propose a method to extract from a network a small subgraph representative for its overall community structure. We use a model selection procedure based on the modularity statistic which is novel, because modularity is commonly used only at a training level. We demonstrate the effectiveness of our model on synthetic networks and benchmark data from real networks (power grid network and protein interaction network of yeast). Finally, we compare our model with the Nyström method, showing that our approach is better in terms of quality of the discovered partitions and needs less computation time.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.