This extended abstract describes and analyses a near-optimal probabilistic algorithm, HYPERLOGLOG, dedicated to estimating the number of distinct elements (the cardinality) of very large data ensembles. Using an auxiliary memory of $m$ units (typically, "short bytes"), HYPERLOGLOG performs a single pass over the data and produces an estimate of the cardinality such that the relative accuracy (the standard error) is typically about $1.04/\sqrt{m}$. This improves on the best previously known cardinality estimator, LOGLOG, whose accuracy can be matched by consuming only 64% of the original memory. For instance, the new algorithm makes it possible to estimate cardinalities well beyond $10^9$ with a typical accuracy of 2% while using a memory of only 1.5 kilobytes. The algorithm parallelizes optimally and adapts to the sliding window model.
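The single-pass estimation described in this abstract can be illustrated with a minimal sketch. It is not the authors' reference implementation: the function name `hyperloglog_estimate`, the use of SHA-1 truncated to 64 bits as the hash, and the parameter `b` (with $m = 2^b$ registers) are assumptions made for illustration. With a 64-bit hash the large-range correction is omitted, and the bias constant used is the one commonly quoted for $m \ge 128$.

```python
import hashlib
import math


def hyperloglog_estimate(items, b=10):
    """Estimate the number of distinct items with m = 2**b registers.

    Illustrative sketch only; assumes b >= 7 so that the usual
    constant alpha_m = 0.7213 / (1 + 1.079 / m) applies.
    """
    m = 1 << b
    registers = [0] * m
    tail_bits = 64 - b
    for item in items:
        # Assumption: SHA-1 truncated to 64 bits stands in for a uniform hash.
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        x = int.from_bytes(digest[:8], "big")
        j = x >> tail_bits                   # first b bits select a register
        w = x & ((1 << tail_bits) - 1)       # remaining bits
        # Rank = position of the leftmost 1-bit in w (tail_bits + 1 if w == 0).
        rank = tail_bits - w.bit_length() + 1
        if rank > registers[j]:
            registers[j] = rank
    alpha = 0.7213 / (1 + 1.079 / m)
    # Normalised harmonic mean of 2**register values.
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    # Small-range correction: fall back to linear counting on empty registers.
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros > 0:
        return m * math.log(m / zeros)
    return raw


if __name__ == "__main__":
    stream = [i % 50000 for i in range(200000)]  # 50000 distinct values, repeated
    print(round(hyperloglog_estimate(stream, b=10)))
```

With `b=10` the sketch keeps 1024 small registers, so the expected standard error is about $1.04/\sqrt{1024} \approx 3.3\%$. Because each register only keeps a maximum, duplicates never change the result, and registers from separate streams can be merged by taking element-wise maxima.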
An $m$-ballot path of size $n$ is a path on the square grid consisting of north and east steps, starting at $(0, 0)$, ending at $(mn, n)$, and never going below the line $\{x = my\}$. The set of these paths can be equipped with a lattice structure, called the $m$-Tamari lattice and denoted by $T_n^{(m)}$, which generalizes the usual Tamari lattice $T_n$ obtained when $m = 1$. We prove that the number of intervals in this lattice is $\frac{m+1}{n(mn+1)}\binom{(m+1)^2 n+m}{n-1}$. This formula was recently conjectured by Bergeron in connection with the study of diagonal coinvariant spaces. The case $m = 1$ was proved a few years ago by Chapoton. Our proof is based on a recursive description of intervals, which translates into a functional equation satisfied by the associated generating function. The solution of this equation is an algebraic series, obtained by a guess-and-check approach. Finding a bijective proof remains an open problem.
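The closed formula for the number of intervals, $\frac{m+1}{n(mn+1)}\binom{(m+1)^2 n+m}{n-1}$, is easy to evaluate exactly. The short sketch below does so with integer arithmetic; the function name `m_tamari_intervals` is chosen here for illustration and does not come from the paper.

```python
from math import comb


def m_tamari_intervals(m, n):
    """Evaluate (m+1) / (n(mn+1)) * C((m+1)^2 n + m, n-1) exactly."""
    if m < 1 or n < 1:
        raise ValueError("m and n must be positive integers")
    numerator = (m + 1) * comb((m + 1) ** 2 * n + m, n - 1)
    quotient, remainder = divmod(numerator, n * (m * n + 1))
    if remainder != 0:
        # Not expected: the formula counts intervals, so it should be an integer.
        raise ArithmeticError("formula did not evaluate to an integer")
    return quotient


if __name__ == "__main__":
    # m = 1 is the usual Tamari lattice (the case proved by Chapoton).
    print([m_tamari_intervals(1, n) for n in range(1, 6)])  # [1, 3, 13, 68, 399]
    print([m_tamari_intervals(2, n) for n in range(1, 4)])  # [1, 6, 58]
```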
Boltzmann models from statistical physics combined with methods from analytic combinatorics give rise to efficient algorithms for the random generation of unlabelled objects. The resulting algorithms generate in an unbiased manner discrete configurations that may have nontrivial symmetries, and they do so by means of real-arithmetic computations. We present a collection of construction rules for such samplers, which applies to a wide variety of combinatorial classes, including integer partitions, necklaces, unlabelled functional graphs, dictionaries, series-parallel circuits, term trees and acyclic molecules obeying a variety of constraints, and so on. Under an abstract real-arithmetic computation model, the algorithms are, for many classical structures, of linear complexity provided a small tolerance is allowed on the size of the object drawn. As opposed to many of their discrete competitors, the resulting programs routinely make it possible to generate random objects of sizes in the range $10^4$ to $10^6$.
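As an illustration of the idea, here is a minimal sketch of a Boltzmann-style sampler for one of the classes listed above, integer partitions, with a tolerance on the size. It relies on the standard fact that, under the Boltzmann model with parameter $x$, the multiplicities of the part sizes $k$ are independent geometric variables with ratio $x^k$. The function names, the tuning $x = e^{-\pi/\sqrt{6n}}$ (which makes the expected size roughly $n$), and the rejection window are assumptions made for this example, not the paper's construction rules.

```python
import math
import random


def boltzmann_partition(x, max_part, rng=random):
    """Draw a partition with parts <= max_part under the Boltzmann model.

    The multiplicity of part size k is geometric: P(j) = (1 - x**k) * x**(k*j).
    """
    log_x = math.log(x)
    parts = []
    for k in range(1, max_part + 1):
        u = 1.0 - rng.random()                     # uniform in (0, 1]
        count = int(math.log(u) / (k * log_x))     # geometric with ratio x**k
        parts.extend([k] * count)
    return parts


def approx_size_partition(n, eps=0.1, rng=random):
    """Rejection loop: keep a draw whose size lies in [(1-eps)n, (1+eps)n]."""
    x = math.exp(-math.pi / math.sqrt(6 * n))      # assumed tuning, E[size] ~ n
    lo, hi = (1 - eps) * n, (1 + eps) * n
    # Parts larger than hi would force a rejection anyway, so cap them here.
    max_part = int(hi)
    while True:
        parts = boltzmann_partition(x, max_part, rng)
        if lo <= sum(parts) <= hi:
            return parts


if __name__ == "__main__":
    sample = approx_size_partition(1000, eps=0.1, rng=random.Random(1))
    print(sum(sample), len(sample))
```

Two partitions of the same size receive the same probability under this model, so the draw is uniform once the size is fixed; the tolerance `eps` trades exactness of the size for a small expected number of rejections.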
We present a bijection between some quadrangular dissections of a hexagon and unrooted binary trees with interesting consequences for enumeration, mesh compression, and graph sampling. Our bijection yields an efficient uniform random sampler for 3-connected planar graphs, which turns out to be decisive for the quadratic complexity of the current best-known uniform random sampler for labelled planar graphs. It also provides an encoding for the set $P(n)$ of $n$-edge 3-connected planar graphs that matches the entropy bound $\frac{1}{n}\log_2 |P(n)| = 2 + o(1)$ bits per edge (bpe). This solves a theoretical problem recently raised in mesh compression as these graphs abstract the combinatorial part of meshes with spherical topology. We also achieve the optimal parametric rate $\frac{1}{n}\log_2 |P(n, i, j)|$ bpe for graphs of $P(n)$ with $i$ vertices and $j$ faces, matching in particular the optimal rate for triangulations. Our encoding relies on a linear time algorithm to compute an orientation associated with the minimal Schnyder wood of a 3-connected planar map. This algorithm is of independent interest, and it is, for instance, a key ingredient in a recent straight line drawing algorithm for 3-connected planar graphs. ACM Reference Format: Fusy, É., Schaeffer, G., and Poulalhon, D. 2008. Dissections, orientations, and trees, with applications to optimal mesh encoding and random sampling.