In the beautifully simple-to-state problem of trace reconstruction, the goal is to reconstruct an unknown binary string $x$ given random "traces" of $x$, where each trace is generated by deleting each coordinate of $x$ independently with probability $p < 1$. The problem is well studied both when the unknown string is arbitrary and when it is chosen uniformly at random. In both settings, there is still an exponential gap between the upper and lower bounds on sample complexity, and our understanding of the problem remains surprisingly limited. In this paper, we consider natural parameterizations and generalizations of this problem in an effort to attain a deeper and more comprehensive understanding. Perhaps our most surprising results are:
1. We prove that $\exp(O(n^{1/4}\sqrt{\log n}))$ traces suffice for reconstructing arbitrary matrices. In the matrix version of the problem, each row and column of an unknown $\sqrt{n} \times \sqrt{n}$ matrix is deleted independently with probability $p$. This result contrasts with sequence reconstruction, where the best known upper bound is $\exp(O(n^{1/3}))$.
2. An optimal result for random matrix reconstruction: we show that $\Theta(\log n)$ traces are necessary and sufficient. This is in contrast to the problem for random sequences, where there is a superlogarithmic lower bound and the best known upper bound is $\exp(O(\log^{1/3} n))$.
3. We show that $\exp(O(k^{1/3}\log^{2/3} n))$ traces suffice to reconstruct $k$-sparse strings, an improvement over the best known sequence reconstruction results when $k = o(n/\log^{2} n)$.
4. We show that $\mathrm{poly}(n)$ traces suffice if $x$ is $k$-sparse and we additionally have a "separation" promise, specifically that the indices of the 1's in $x$ all differ by $\Omega(k \log n)$.
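The deletion channel described above can be sketched in a few lines of Python. This is an illustration only: the function names `trace` and `matrix_trace`, the string and list-of-rows representations, and the explicit random generator argument are assumptions made for this example, not notation from the paper. The sketch only generates traces; it does not attempt reconstruction.

```python
import random


def trace(x, p, rng):
    """Return one trace of the string x.

    Each symbol of x is deleted independently with probability p;
    the surviving symbols keep their original order.
    """
    return "".join(c for c in x if rng.random() >= p)


def matrix_trace(m, p, rng):
    """Return one trace of the matrix m (a list of equal-length rows).

    Each row and each column is deleted independently with probability p;
    the trace is the submatrix on the surviving rows and columns.
    """
    n_cols = len(m[0]) if m else 0
    keep_rows = [i for i in range(len(m)) if rng.random() >= p]
    keep_cols = [j for j in range(n_cols) if rng.random() >= p]
    return [[m[i][j] for j in keep_cols] for i in keep_rows]


def is_subsequence(t, x):
    """Check that t can be obtained from x by deleting symbols."""
    it = iter(x)
    return all(c in it for c in t)
```

A trace is always a subsequence of the original string, and a matrix trace is always a submatrix on some subset of rows and columns. Setting `p` to 0 returns the input unchanged, which is a convenient sanity check.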
To capture the inherent geometric features of many community detection problems, we propose a new random graph model of communities that we call the Geometric Block Model. The geometric block model generalizes random geometric graphs in the same way that the well-studied stochastic block model generalizes Erdős-Rényi random graphs. It is also a natural extension of random community models, inspired by recent theoretical and practical advances in community detection. While the model is of fundamental theoretical interest, our main contribution is to show that many practical community structures are better explained by the geometric block model. We also show that a simple triangle-counting algorithm for detecting communities in the geometric block model is near-optimal. Indeed, even in the sparse-graph regime, where the average degree of the graph grows only logarithmically with the number of vertices, we show that this algorithm performs extremely well, both theoretically and practically. In contrast, the triangle-counting algorithm is far from optimal for the stochastic block model. We run experiments on both real and synthetic datasets to show the superior performance of both the new model and our algorithm.
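As a rough illustration of the kind of model and statistic involved, the Python sketch below samples a two-community graph and counts common neighbors, which is the number of triangles a pair of vertices would close. The abstract does not give the model's definition, so everything specific here is an assumption made for the example: points are placed uniformly on a circle of circumference 1, same-community pairs are joined when their distance is at most `r_in`, cross-community pairs when it is at most `r_out`, and all names (`sample_gbm`, `common_neighbors`, `circle_dist`) are hypothetical. The decision thresholds used by the authors' algorithm are not reproduced.

```python
import random


def circle_dist(a, b):
    """Distance between two points on a circle of circumference 1."""
    d = abs(a - b)
    return min(d, 1.0 - d)


def sample_gbm(n, r_in, r_out, rng):
    """Sample a two-community geometric graph (assumed formulation).

    Vertex i gets community label i % 2 and a uniform position on the
    circle. Two vertices are adjacent when their distance is at most
    r_in (same community) or at most r_out (different communities).
    """
    labels = [i % 2 for i in range(n)]
    pos = [rng.random() for _ in range(n)]
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            radius = r_in if labels[u] == labels[v] else r_out
            if circle_dist(pos[u], pos[v]) <= radius:
                adj[u].add(v)
                adj[v].add(u)
    return labels, pos, adj


def common_neighbors(adj, u, v):
    """Number of vertices adjacent to both u and v."""
    return len(adj[u] & adj[v])
```

Under this assumed formulation with `r_in > r_out`, two vertices in the same community that are close together tend to share many neighbors, which is the intuition a triangle-counting test relies on.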