We consider the community detection problem in a sparse q-uniform hypergraph G, assuming that G is generated according to the so-called Hypergraph Stochastic Block Model (HSBM). We prove that a spectral method based on the non-backtracking operator for hypergraphs works with high probability down to the generalized Kesten-Stigum detection threshold conjectured by Angelini et al. in [12]. We characterize the spectrum of the non-backtracking operator for the sparse HSBM, and provide an efficient dimension reduction procedure using the Ihara-Bass formula for hypergraphs. As a result, community detection for the sparse HSBM on n vertices can be reduced to an eigenvector problem of a 2n × 2n non-normal matrix constructed from the adjacency matrix and the degree matrix of the hypergraph. To the best of our knowledge, this is the first provable and efficient spectral algorithm that achieves the conjectured threshold for HSBMs with k blocks generated according to a general symmetric probability tensor.
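The reduction to a 2n × 2n matrix can be illustrated with a minimal sketch. The abstract does not state the block form, so the layout used here, [[0, D − I], [−(q − 1) I, A − (q − 2) I]], is an assumption based on the usual shape of Ihara-Bass-type reductions (it reduces to the familiar [[0, D − I], [−I, A]] for graphs, q = 2) and should be checked against the paper. The function name and the input encoding (hyperedges as tuples of vertex indices) are hypothetical.

```python
from itertools import combinations


def reduced_nonbacktracking_matrix(n, hyperedges, q):
    """Build a 2n x 2n matrix from the adjacency and degree matrices
    of a q-uniform hypergraph on vertices 0..n-1.

    ASSUMPTION: the block form [[0, D - I], [-(q-1) I, A - (q-2) I]]
    is not given in the abstract; it is an illustrative guess.
    """
    # A[u][v] counts hyperedges containing both u and v (u != v).
    A = [[0] * n for _ in range(n)]
    # D[v] counts hyperedges containing v.
    D = [0] * n
    for e in hyperedges:
        if len(set(e)) != q:
            raise ValueError("every hyperedge must have q distinct vertices")
        for v in e:
            D[v] += 1
        for u, v in combinations(e, 2):
            A[u][v] += 1
            A[v][u] += 1

    M = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        M[i][n + i] = D[i] - 1          # top-right block: D - I
        M[n + i][i] = -(q - 1)          # bottom-left block: -(q-1) I
        for j in range(n):              # bottom-right block: A - (q-2) I
            M[n + i][n + j] = A[i][j] - ((q - 2) if i == j else 0)
    return M


# Two 3-uniform hyperedges sharing vertices 1 and 2.
M = reduced_nonbacktracking_matrix(4, [(0, 1, 2), (1, 2, 3)], q=3)
```

In practice the resulting matrix would be passed to a numerical eigensolver for non-symmetric matrices; that step is omitted here to keep the sketch dependency-free.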
When immersed in seawater after a collier accident, a fossil fuel such as coal could become a source of pollution for the marine environment. To study the effect of such contamination, four coal samples from different origins were used. An initial analysis of these coals enabled us to determine their content of polycyclic aromatic hydrocarbons (PAHs). Seawater was then mixed with coal to study the organic matter released from coal into seawater. Fluorescence was used for its sensitivity to aromatic compounds, with the additional purpose of evaluating the relevance of using an immersed fluorescence probe to monitor water pollution. Excitation-emission matrices were recorded, and the excitation-emission wavelength range corresponding to the highest fluorescence intensity was 230 nm/[370 nm; 420 nm]. The samples containing coal fluoresced more than the coal-free samples, with the difference depending on the coal origin. The fluorescence intensity increased with coal mass, up to some limit. The particle size also influenced the fluorescence intensity: the finest particles released more fluorescing substances because of their higher exchange surface. When seawater percolated through coal, the samples fluoresced strongly at the beginning, and then the fluorescence intensity decreased and reached the seawater level. However, even with a 10 ns acquisition time shift, the fluorescence spectra were not specific enough to show the presence of PAHs in the samples; if released from coal into seawater at all, the PAHs were too dilute to be detected. The fluorescence lifetimes of the seawater and of the coal samples were 4.7 and 3.8 ns, respectively, indicating that the substances released from coal mainly consisted of short-lived fluorescing substances, such as natural humic or fulvic substances. Consequently, the presence of coal does not seem to be too detrimental to the marine environment, and a direct fluorescence probe could be used to monitor the increase in the organic load of seawater caused by the immersion of coal.
The present work is concerned with community detection. Specifically, we consider a random graph drawn according to the stochastic block model: its vertex set is partitioned into blocks, or communities, and edges are placed randomly and independently of each other with probability depending only on the communities of their two endpoints. In this context, our aim is to recover the community labels better than by random guess, based only on the observation of the graph. In the sparse case, where edge probabilities are in O(1/n), we introduce a new spectral method based on the distance matrix D^(ℓ), where D^(ℓ)_ij = 1 iff the graph distance between i and j, denoted d(i, j), is equal to ℓ. We show that when ℓ ∼ c log(n) for carefully chosen c, the eigenvectors associated with the largest eigenvalues of D^(ℓ) provide enough information to perform non-trivial community recovery with high probability, provided we are above the so-called Kesten-Stigum (KS) threshold. This yields an efficient algorithm for community detection, since the matrix D^(ℓ) can be computed in O(n^(1+κ)) operations for a small constant κ. We then study the sensitivity of the eigendecomposition of D^(ℓ) when we allow an adversarial perturbation of the edges of G. We show that when the considered perturbation does not affect more than O(n^ε) vertices for some small ε > 0, the largest eigenvalues and their corresponding eigenvectors incur negligible perturbations, which allows us to still perform efficient recovery. Our proposed spectral method therefore: i) is robust to larger perturbations than prior spectral methods, while semi-definite programming (SDP) methods can tolerate yet larger perturbations; ii) achieves non-trivial detection down to the KS threshold, which is conjectured to be optimal and is beyond the reach of existing SDP approaches; iii) is faster than SDP approaches.
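The definition of the distance matrix D^(ℓ) can be made concrete with a short sketch. It runs a breadth-first search from every vertex, truncated at depth ℓ, and marks the pairs at graph distance exactly ℓ. The function name and the adjacency-list input format are hypothetical, and this is only an illustration of the definition, not the authors' implementation or their complexity-optimal procedure.

```python
from collections import deque


def distance_matrix(adj, ell):
    """Return D with D[i][j] = 1 iff the graph distance d(i, j) equals ell.

    adj is an adjacency list: adj[u] is the list of neighbours of u.
    """
    n = len(adj)
    D = [[0] * n for _ in range(n)]
    for s in range(n):
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            if dist[u] == ell:
                continue  # no need to explore beyond depth ell
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for v, d in dist.items():
            if d == ell:
                D[s][v] = 1
    return D


# Path graph 0 - 1 - 2 - 3, pairs at distance exactly 2.
path = [[1], [0, 2], [1, 3], [2]]
D2 = distance_matrix(path, 2)
```

The resulting matrix is symmetric, so its leading eigenvectors could then be obtained with any symmetric eigensolver; that step is left out to keep the sketch free of external dependencies.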