We present a clustering scheme that combines a mode-seeking phase with a cluster merging phase in the corresponding density map. While mode detection is done by a standard graph-based hill-climbing scheme, the novelty of our approach resides in its use of topological persistence to guide the merging of clusters. Our algorithm provides additional feedback in the form of a set of points in the plane, called a persistence diagram (PD), which provably reflects the prominences of the modes of the density. In practice, this feedback enables the user to choose relevant parameter values, so that under mild sampling conditions the algorithm will output the correct number of clusters, a notion that can be made formally sound within persistence theory. In addition, the spatial locations of the output clusters are tied to those of the basins of attraction of the peaks of the density. The algorithm only requires rough estimates of the density at the data points, and knowledge of (approximate) pairwise distances between them. It is therefore applicable in any metric space. Meanwhile, its complexity remains practical: although the size of the input distance matrix may be up to quadratic in the number of data points, a careful implementation only uses a linear amount of memory and takes barely more time to run than to read through the input.
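The two phases described above (graph-based hill climbing, then persistence-guided merging) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `tomato_sketch`, the adjacency-list input `neighbors`, and the merging threshold `tau` are assumptions made for the example, and the density estimates are taken as given.

```python
import math


def tomato_sketch(density, neighbors, tau):
    """Hill-climbing clustering with prominence-based merging (illustrative).

    density   -- list of density estimates, one per data point (assumed given)
    neighbors -- adjacency lists of a neighborhood graph on the points
    tau       -- hypothetical merging threshold on peak prominence
    """
    n = len(density)
    parent = list(range(n))  # union-find; each root is the highest peak of its cluster

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Process points by decreasing density; rank breaks ties consistently.
    order = sorted(range(n), key=lambda i: -density[i])
    rank = [0] * n
    for r, v in enumerate(order):
        rank[v] = r

    merges = []  # (peak density, merge level) of every cluster that gets absorbed
    for i in order:
        higher = [j for j in neighbors[i] if rank[j] < rank[i]]
        if not higher:
            continue  # local peak: i starts its own cluster
        # Hill-climbing step: attach i to the cluster of its highest neighbor.
        steepest = min(higher, key=lambda j: rank[j])
        parent[i] = find(steepest)
        # Merging step: i connects clusters; absorb those of low prominence.
        for j in higher:
            ri, rj = find(i), find(j)
            if ri == rj:
                continue
            low, high = (ri, rj) if rank[ri] > rank[rj] else (rj, ri)
            if density[low] - density[i] < tau:
                parent[low] = high
                merges.append((density[low], density[i]))

    labels = [find(i) for i in range(n)]
    return labels, merges
```

Under these assumptions, calling the function with `tau = math.inf` records every (peak, merge level) pair, which plays the role of the persistence diagram feedback; a threshold separating prominent peaks from the rest can then be chosen from those pairs and the function rerun with it.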
We present a new algorithm for computing zigzag persistent homology, an algebraic structure which encodes changes to homology groups of a simplicial complex over a sequence of simplex additions and deletions. Provided that there is an algorithm that multiplies two $n \times n$ matrices in $M(n)$ time, our algorithm runs in $O(M(n) + n^2 \log^2 n)$ time for a sequence of $n$ additions and deletions. In particular, the running time is $O(n^{2.376})$, by a result of Coppersmith and Winograd. The fastest previously known algorithm for this problem takes $O(n^3)$ time in the worst case.
Given a real-valued function f defined over some metric space X, is it possible to recover some structural information about f from the sole information of its values at a finite set L ⊆ X of sample points, whose pairwise distances in X are given? We provide a positive answer to this question. More precisely, taking advantage of recent advances on the front of stability for persistence diagrams, we introduce a novel algebraic construction, based on a pair of nested families of simplicial complexes built on top of the point cloud L, from which the persistence diagram of f can be faithfully approximated. We derive from this construction a series of algorithms for the analysis of scalar fields from point cloud data. These algorithms are simple and easy to implement, they have reasonable complexities, and they come with theoretical guarantees. To illustrate the genericity and practicality of the approach, we also present some experimental results obtained in various applications, ranging from clustering to sensor networks.
Key-words: Persistent homology, Persistence modules, Sampling theory, Vietoris-Rips complexes, Morse theory
Contact: frederic.chazal@inria.fr, guibas@cs.stanford.edu, steve.oudot@inria.fr, primoz@stanford.edu
Analysis of scalar fields over point clouds
Summary: Given a scalar function f defined over a metric space X, is it possible to extract information about the structure of the graph of f from the sole knowledge of its values on a finite set L of samples of X, together with the geodesic distances between the points of L? This article answers this question positively. More precisely, building on recent results on the stability of persistence diagrams, we introduce a new algebraic construction using a pair of families of nested simplicial complexes, from which the persistence diagram of f can be computed approximately. We derive from this algebraic construction a family of algorithms for the analysis of scalar fields from point clouds. These algorithms are simple and easy to implement, and they have reasonable complexities as well as theoretical guarantees. To illustrate the genericity of our approach, we present experimental results obtained in various applications, such as clustering or wireless sensor networks.
In this paper we study the connection between the phenomenon of homological percolation (the formation of "giant" cycles in persistent homology), and the zeros of the expected Euler characteristic curve. We perform an experimental study that covers four different models: site-percolation on the cubical and permutahedral lattices, the Poisson-Boolean model, and Gaussian random fields. All the models are generated on the flat torus $\mathbb{T}^d$, for $d = 2, 3, 4$. The simulation results strongly indicate that the zeros of the expected Euler characteristic curve approximate the critical values for homological percolation. Our results also provide some insight about the approximation error. Further study of this connection could have powerful implications both in the study of percolation theory, and in the field of Topological Data Analysis.
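To make the notion of an expected Euler characteristic curve concrete, here is a minimal sketch for one simple case: site percolation by closed unit squares on a two-dimensional periodic grid. The abstract does not spell out the construction, so the model details, the function names `euler_characteristic_torus` and `mean_ec_curve`, and all parameters are assumptions made for illustration only.

```python
import numpy as np


def euler_characteristic_torus(occ):
    """Euler characteristic V - E + F of the union of occupied closed squares.

    occ is a 2D boolean array with periodic boundary conditions; entry (i, j)
    marks the square [i, i+1] x [j, j+1] as occupied. Shared edges and
    vertices are counted once.
    """
    occ = np.asarray(occ, dtype=bool)
    faces = int(occ.sum())
    # An edge along axis 0 at (i, j) borders squares (i, j) and (i, j-1).
    edges_a = occ | np.roll(occ, 1, axis=1)
    # An edge along axis 1 at (i, j) borders squares (i, j) and (i-1, j).
    edges_b = occ | np.roll(occ, 1, axis=0)
    # A vertex at (i, j) touches squares (i, j), (i-1, j), (i, j-1), (i-1, j-1).
    verts = edges_a | np.roll(edges_a, 1, axis=0)
    return int(verts.sum()) - int(edges_a.sum()) - int(edges_b.sum()) + faces


def mean_ec_curve(size, probs, trials=50, seed=0):
    """Monte Carlo estimate of the expected Euler characteristic for each p."""
    rng = np.random.default_rng(seed)
    curve = []
    for p in probs:
        samples = [
            euler_characteristic_torus(rng.random((size, size)) < p)
            for _ in range(trials)
        ]
        curve.append(float(np.mean(samples)))
    return curve
```

Evaluating `mean_ec_curve` on a fine grid of occupation probabilities and locating the sign changes of the resulting curve gives empirical zeros, which is the quantity the abstract relates to the percolation thresholds.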
We study the distributed desynchronization problem for graphs with arbitrary topology. Motivated by the severe computational limitations of sensor networks, we present a randomized algorithm for network desynchronization that uses an extremely lightweight model of computation, while being robust to link volatility and node failure. These techniques also provide novel, ultra-lightweight randomized algorithms for quickly computing distributed vertex colorings using an asymptotically optimal number of colors.
We present a clustering scheme that combines a mode-seeking phase with a cluster merging phase in the corresponding density map. While mode detection is done by a standard graph-based hill-climbing scheme, the novelty of our approach resides in its use of topological persistence to guide the merging of clusters. Our algorithm provides additional feedback in the form of a set of points in the plane, called a persistence diagram (PD), which provably reflects the prominences of the modes of the density. In practice, this feedback enables the user to choose relevant parameter values, so that under mild sampling conditions the algorithm will output the correct number of clusters, a notion that can be made formally sound within persistence theory. The algorithm only requires rough estimates of the density at the data points, and knowledge of (approximate) pairwise distances between them. It is therefore applicable in any metric space. Meanwhile, its complexity remains practical: although the size of the input distance matrix may be up to quadratic in the number of data points, a careful implementation only uses a linear amount of memory and takes barely more time to run than to read through the input. In this conference version of the paper we emphasize the experimental aspects of our work, describing the approach, giving an intuitive overview of its theoretical guarantees, discussing the choice of its parameters in practice, and demonstrating its potential in terms of applications through a series of experimental results obtained on synthetic and real-life data sets. Precise statements and proofs of our theoretical claims can be found in the full version of the paper [7].