Mathias Bæk Tejs Knudsen scite author profile

The longest common extension problem (LCE problem) is to construct a data structure for an input string T of length n that supports LCE(i, j) queries. Such a query returns the length of the longest common prefix of the suffixes starting at positions i and j in T. This classic problem has a well-known solution that uses O(n) space and O(1) query time. In this paper we show that for any trade-off parameter 1 ≤ τ ≤ n, the problem can be solved in O(n/τ) space and O(τ) query time. This significantly improves the previously best known time-space trade-offs, and almost matches the best known time-space product lower bound.
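To make the query definition concrete, here is a minimal Python sketch of a naive LCE query that compares characters directly. It is not the data structure from the paper: it uses no extra space beyond the string and takes time proportional to the answer, which sits at one extreme of the trade-off. The function name `lce` and the 0-indexed positions are assumptions made for illustration.

```python
def lce(text: str, i: int, j: int) -> int:
    """Return the length of the longest common prefix of text[i:] and text[j:].

    Naive baseline (an illustration, not the paper's structure): no
    preprocessing, O(1) extra space, O(answer) time per query.
    Positions are 0-indexed, which is an assumption of this sketch.
    """
    n = len(text)
    length = 0
    # Extend the match one character at a time until a mismatch
    # or until either suffix runs out.
    while i + length < n and j + length < n and text[i + length] == text[j + length]:
        length += 1
    return length


if __name__ == "__main__":
    # The suffixes "abracadabra" and "abra" share the prefix "abra".
    print(lce("abracadabra", 0, 7))  # 4
```

A data structure with trade-off parameter τ would answer the same queries from an index of size O(n/τ) instead of scanning characters one by one.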

Optimal Induced Universal Graphs and Adjacency Labeling for Trees

Alstrup

Dahlgaard

Knudsen

2017

J. ACM

We show that there exists a graph G with O(n) nodes, such that any forest of n nodes is a node-induced subgraph of G. Furthermore, for constant arboricity k, the result implies the existence of a graph with O(n^k) nodes that contains all n-node graphs of arboricity k as node-induced subgraphs, matching an Ω(n^k) lower bound. The lower bound and previously best upper bounds were presented in Alstrup and Rauhe [FOCS'02]. Our upper bounds are obtained through a log_2 n + O(1) labeling scheme for adjacency queries in forests.

Hashing for Statistics over K-Partitions

Dahlgaard

Knudsen

Rotenberg

et al. 2015

In this paper we analyze a hash function for k-partitioning a set into bins, obtaining strong concentration bounds for standard algorithms combining statistics from each bin. This generic method was originally introduced by Flajolet and Martin [FOCS'83] in order to save a factor Ω(k) of time per element over k independent samples when estimating the number of distinct elements in a data stream. It was also used in the widely used HyperLogLog algorithm of Flajolet et al. [AOFA'97] and in large-scale machine learning by Li et al. [NIPS'12] for minwise estimation of set similarity. The main issue of k-partitioning is that the contents of different bins may be highly correlated when using popular hash functions. This means that methods of analyzing the marginal distribution for a single bin do not apply. Here we show that a tabulation-based hash function, mixed tabulation, does yield strong concentration bounds on the most popular applications of k-partitioning, similar to those we would get using a truly random hash function. The analysis is very involved and implies several new results of independent interest for both simple and double tabulation, e.g. a simple and efficient construction for invertible Bloom filters and uniform hashing on a given set.
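The k-partitioning idea can be sketched in a few lines of Python: each element is hashed once, part of the hash picks a bin, and the rest serves as the value whose minimum is kept per bin. This is only an illustration of the general scheme for minwise similarity estimation. It uses BLAKE2b from `hashlib` as a stand-in hash function, not mixed tabulation, and the names `k_partition_sketch` and `estimate_jaccard` are hypothetical.

```python
import hashlib
from typing import Iterable, List, Optional


def _hash64(item: str) -> int:
    """64-bit hash of a string. BLAKE2b is a stand-in chosen for this
    sketch; the paper analyzes mixed tabulation instead."""
    digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "little")


def k_partition_sketch(items: Iterable[str], k: int) -> List[Optional[int]]:
    """Hash each item once, route it to one of k bins, and keep the
    minimum value seen in each bin (None marks an empty bin)."""
    bins: List[Optional[int]] = [None] * k
    for item in items:
        h = _hash64(item)
        bin_index = h % k   # which bin the item falls into
        value = h // k      # value compared within the bin
        if bins[bin_index] is None or value < bins[bin_index]:
            bins[bin_index] = value
    return bins


def estimate_jaccard(sketch_a: List[Optional[int]], sketch_b: List[Optional[int]]) -> float:
    """Fraction of bins with equal minima, ignoring bins empty in both sketches."""
    matches = 0
    used = 0
    for a, b in zip(sketch_a, sketch_b):
        if a is None and b is None:
            continue
        used += 1
        if a == b:
            matches += 1
    return matches / used if used else 0.0


if __name__ == "__main__":
    a = k_partition_sketch((f"item{i}" for i in range(1000)), 64)
    b = k_partition_sketch((f"item{i}" for i in range(500, 1500)), 64)
    print(estimate_jaccard(a, b))  # an estimate of the true similarity 1/3
```

Each element costs one hash evaluation regardless of k, which is the time saving over k independent samples. The price is that the bins are no longer independent, and that dependence is what the concentration analysis has to handle.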

Finding even cycles faster via capped k-walks

Dahlgaard

Knudsen

Stöckel

2017

Finding cycles in graphs is a fundamental problem in algorithmic graph theory. In this paper, we consider the problem of finding and reporting a cycle of length 2k in an undirected graph G with n nodes and m edges for constant k ≥ 2. A classic result by Bondy and Simonovits [J. Combinatorial Theory, 1974] [...] We present an algorithm that uses O(m^{2k/(k+1)}) time and finds a 2k-cycle if one exists. This bound is O(n^2) exactly when m = Θ(n^{1+1/k}). When finding 4-cycles our new bound coincides with Alon et al., while for every k > 2 our new bound yields a polynomial improvement in m. Yuster and Zwick noted that it is "plausible to conjecture that O(n^2) is the best possible bound in terms of n". We show "conditional optimality": if this hypothesis holds then our O(m^{2k/(k+1)}) algorithm is tight as well. Furthermore, a folklore reduction implies that no combinatorial algorithm can determine if a graph contains a 6-cycle in time O(m^{3/2−ε}) for any ε > 0 unless Boolean matrix multiplication can be solved combinatorially in time O(n^{3−ε′}) for some ε′ > 0, which is widely believed to be false. Coupled with our main result, this gives tight bounds for finding 6-cycles combinatorially and also separates the complexity of finding 4- and 6-cycles, giving evidence that the exponent of m in the running time should indeed increase with k. The key ingredient in our algorithm is a new notion of capped k-walks, which are walks of length k that visit only nodes according to a fixed ordering. Our main technical contribution is an involved analysis proving several properties of such walks which may be of independent interest.
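The relation between the two bounds quoted in this abstract can be checked by direct substitution. The short derivation below uses only the quantities stated above and is written out as a reading aid, not taken from the paper.

```latex
\[
  m = n^{1 + 1/k} = n^{(k+1)/k}
  \quad\Longrightarrow\quad
  m^{\frac{2k}{k+1}} = n^{\frac{k+1}{k} \cdot \frac{2k}{k+1}} = n^{2}.
\]
\[
  k = 2 \text{ (4-cycles)}: \; m^{2k/(k+1)} = m^{4/3},
  \qquad
  k = 3 \text{ (6-cycles)}: \; m^{2k/(k+1)} = m^{3/2}.
\]
```

The k = 3 case gives the exponent 3/2 that the conditional lower bound for 6-cycles matches up to ε.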

The Power of Two Choices with Simple Tabulation

Dahlgaard

Knudsen

Rotenberg

et al. 2015
