2021
DOI: 10.1371/journal.pone.0254377

Least-squares community extraction in feature-rich networks using similarity data

Abstract: We explore a doubly-greedy approach to the issue of community detection in feature-rich networks. According to this approach, both the network and feature data are straightforwardly recovered from the underlying unknown non-overlapping communities, supplied with a center in the feature space and intensity weight(s) over the network each. Our least-squares additive criterion allows us to search for communities one-by-one and to find each community by adding entities one by one. A focus of this paper is that the…

Cited by 4 publications (2 citation statements)
References 41 publications
“…We applied the mechanism proposed in [66], which was later used in various papers, for instance [67–69], to generate our synthetic data. In this mechanism, first, we needed to determine the number of data points N, clusters K, and features V. Next, the clusters' cardinalities were determined randomly with two constraints: (a) no cluster should contain less than a pre-specified number of data points (we set this number to 30 in our experiments), and (b) the number of data points in all clusters should sum to N. Once the cluster cardinalities were determined, we generated each cluster from a multivariate normal distribution whose covariance matrix was diagonal with diagonal values derived uniformly at random from the range [0.05, 0.1]; they specify the cluster's spread.…”
Section: Synthetic Data
confidence: 99%
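The generation mechanism quoted above can be sketched as follows. The cluster-size draw and the range used for the cluster centers are assumptions for illustration (the quoted passage specifies only the minimum size of 30, the sum constraint, and the diagonal variances drawn from [0.05, 0.1]):

```python
import numpy as np

def generate_synthetic_data(n_points, n_clusters, n_features, min_size=30, seed=None):
    """Sketch of the quoted mechanism: Gaussian clusters with random
    cardinalities (each >= min_size, summing to n_points) and diagonal
    covariances with entries drawn uniformly from [0.05, 0.1]."""
    rng = np.random.default_rng(seed)
    # Assumed cardinality draw: give each cluster min_size points, then
    # scatter the remainder multinomially so the sizes sum to n_points.
    extra = rng.multinomial(n_points - n_clusters * min_size,
                            np.full(n_clusters, 1.0 / n_clusters))
    sizes = min_size + extra
    data, labels = [], []
    for k, size in enumerate(sizes):
        # Assumed center range; the quoted passage does not specify it.
        center = rng.uniform(-1.0, 1.0, size=n_features)
        # Diagonal covariance: per-feature variances uniform in [0.05, 0.1].
        variances = rng.uniform(0.05, 0.1, size=n_features)
        data.append(rng.normal(center, np.sqrt(variances),
                               size=(size, n_features)))
        labels.extend([k] * size)
    return np.vstack(data), np.array(labels)
```

For example, `generate_synthetic_data(200, 3, 5)` returns a 200-by-5 data matrix together with cluster labels, with every cluster containing at least 30 points.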
“…The Min-Max method standardizes the data as x̃_iv = (x_iv − x_va) / (x_vb − x_va), where x_va and x_vb denote the minimum and maximum values of feature v, so that x̃_iv ∈ [0, 1]. Although it was empirically shown, for instance in [67], that the clustering result might differ depending on how the data were standardized, due to the intuitiveness of the output of the Min-Max technique, we chose it as the default for our experiments.…”
Section: Data Pre-processing
confidence: 99%
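The Min-Max standardization in the quoted passage is the standard column-wise rescaling; a minimal sketch (function name and column-wise convention are illustrative):

```python
import numpy as np

def min_max_standardize(X):
    """Rescale each feature (column) of X to [0, 1] via
    (x - min) / (max - min), matching the quoted formula."""
    x_min = X.min(axis=0)  # per-feature minimum (x_va in the quote)
    x_max = X.max(axis=0)  # per-feature maximum (x_vb in the quote)
    return (X - x_min) / (x_max - x_min)
```

After this transform every column spans exactly [0, 1], which is the intuitiveness the citing authors refer to: each standardized value reads directly as a fraction of the feature's observed range.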