2017
DOI: 10.1093/biomet/asx008

Covariate-assisted spectral clustering

Abstract: Biological and social systems consist of myriad interacting units. The interactions can be represented in the form of a graph or network. Measurements of these graphs can reveal the underlying structure of these interactions, which provides insight into the systems that generated the graphs. Moreover, in applications such as connectomics, social networks, and genomics, graph data are accompanied by contextualizing measures on each node. We utilize these node covariates to help uncover latent communities…

Cited by 121 publications
(135 citation statements)
References 28 publications
“…The efficiency of the proposed model is demonstrated on the state-of-the-art community detection methods by considering the feature based methods and structure based ones'. On state-of-the-art feature based methods, Cesna [25], JCDC [35], NC [22], BAGC [26], SDP [36], and CASC [37] are employed in the experiments. On state-of-the-art structure based methods, BigClam [14], Fast-Greedy [13], Infomap [12], Louvain [9], and COMBO [15] are applied in the experiments.…”
Section: Methods (mentioning)
confidence: 99%
“…The modified stochastic block model aligned with the features is modified in [22] to reveal the efficacy of each feature on community structures by employing the Expectation-Maximization inference stage. On the one hand, most of the model-free feature based methods suffer from the dependency to multiple tuning parameters such as JCDC [35], and CASC [37]. On the other hand, the generative feature based models on extraction of communities have some problems including the model sensitivity on the presumed graphical representation of the features and communities CESNA [25], and modeling a correlation of single feature with the community structure at a time [22].…”
Section: Related Work (mentioning)
confidence: 99%
“…A particularly popular version of these models assumes that the probability of connections between a pair of nodes is equal to the dot product between the nodes' latent positions [89][90][91]. In these models, an extensive set of theoretical investigations have established the kinds of claims we desire when using a statistical model to make inferences about our data [92,93], as well as a number of extensions, including a generalized random dot product [94], a random dot product with node-wise covariates [95], and a latent structure model [96] (for review, see [97]). However, these models typically only operate on single, unweighted networks lacking attributes.…”
Section: Statistical Models Of Connectomes (mentioning)
confidence: 99%
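
The dot-product connection rule described in the statement above can be illustrated with a minimal Python sketch. It assumes NumPy, and the function names `rdpg_edge_probabilities` and `sample_rdpg` are hypothetical names chosen for this illustration; they do not come from the cited works. The sketch also assumes that every pairwise dot product of latent positions lies in [0, 1], so that it can be read as a probability.

```python
import numpy as np


def rdpg_edge_probabilities(latent):
    """Return P with P[i, j] = <x_i, x_j> for latent positions x_i (rows).

    Assumption: all pairwise dot products lie in [0, 1].
    """
    latent = np.asarray(latent, dtype=float)
    P = latent @ latent.T
    if P.min() < -1e-12 or P.max() > 1 + 1e-12:
        raise ValueError("dot products must lie in [0, 1] to be probabilities")
    return np.clip(P, 0.0, 1.0)


def sample_rdpg(latent, rng):
    """Draw one undirected, unweighted graph without self-loops."""
    P = rdpg_edge_probabilities(latent)
    n = P.shape[0]
    # Independent Bernoulli draws above the diagonal, mirrored below it.
    upper = np.triu(rng.random((n, n)) < P, k=1)
    return (upper | upper.T).astype(int)


# Three nodes with two-dimensional latent positions (made-up values).
latent = np.array([[0.8, 0.1], [0.7, 0.2], [0.1, 0.9]])
P = rdpg_edge_probabilities(latent)
A = sample_rdpg(latent, np.random.default_rng(0))
```

Nodes 0 and 1 have similar latent positions, so their connection probability is higher than that of either node with node 2.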
“…Recent developments suggest that using node features or covariates can greatly improve classification accuracy. For example, Binkiewicz et al (2017) add the covariance $XX^{\top}$, with $X \in [-J, J]^{R}$ being the node covariate matrix, to the regularized graph Laplacian and perform the spectral clustering on the static similarity matrix. We extend the static similarity matrix to cover the dynamic case below:…”
Section: Undirected Network (mentioning)
confidence: 99%
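
The static construction described in the statement above (a covariate term added to a regularized graph Laplacian, followed by a spectral embedding) can be sketched in Python as follows. This is a sketch under stated assumptions, not the authors' implementation: it assumes NumPy, it takes the regularized Laplacian to be D_tau^{-1/2} A D_tau^{-1/2} with D_tau = D + tau I, it defaults tau to the mean degree, and it treats the weight `alpha` as a user-chosen tuning parameter. The function names are hypothetical. The final clustering of the embedding rows (for example with k-means) is left out.

```python
import numpy as np


def regularized_laplacian(A, tau=None):
    """Return D_tau^{-1/2} A D_tau^{-1/2}, where D_tau = D + tau * I.

    Assumption: tau defaults to the mean node degree, and the graph has
    at least one edge so that degree + tau is positive.
    """
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    if tau is None:
        tau = deg.mean()
    inv_sqrt = 1.0 / np.sqrt(deg + tau)
    return inv_sqrt[:, None] * A * inv_sqrt[None, :]


def casc_embedding(A, X, k, alpha, tau=None):
    """Top-k eigenvectors of L_tau + alpha * X X^T (one row per node)."""
    X = np.asarray(X, dtype=float)
    S = regularized_laplacian(A, tau) + alpha * (X @ X.T)
    # eigh returns eigenvalues in ascending order; keep the last k columns.
    _, eigvecs = np.linalg.eigh(S)
    return eigvecs[:, -k:]


# Toy input: two 4-node cliques joined by one edge, one covariate per node.
A = np.zeros((8, 8))
A[:4, :4] = 1
A[4:, 4:] = 1
np.fill_diagonal(A, 0)
A[3, 4] = A[4, 3] = 1
X = np.array([[1.0]] * 4 + [[-1.0]] * 4)
U = casc_embedding(A, X, k=2, alpha=0.5)
```

In this toy input the covariate agrees with the clique structure, so one column of `U` takes opposite signs on the two cliques, and clustering the rows of `U` would recover the two groups.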
“…To establish the consistency of the CASC for the dynamic SC-DCBM, we need to determine the upper bounds for the misclustering rates. Following Binkiewicz et al (2017), we denote $C_{i,t}$ and $\mathcal{C}_{i,t}$ as the cluster centroids of the ith node at time t generated using k-medians clustering on the sample eigenvector $U_t$ and the population $\mathcal{U}_t$, respectively. Then, we define the set of mis-clustered nodes at each period as…”
Section: Undirected Case (mentioning)
confidence: 99%