Randomness extraction involves the processing of purely classical information and is therefore usually studied in the framework of classical probability theory. However, such a classical treatment is generally too restrictive for applications where side information about the values taken by classical random variables may be represented by the state of a quantum system. This is particularly relevant in the context of cryptography, where an adversary may make use of quantum devices. Here, we show that the well-known construction paradigm for extractors proposed by Trevisan is sound in the presence of quantum side information. We exploit the modularity of this paradigm to give several concrete extractor constructions, which, for example, extract all the conditional (smooth) min-entropy of the source using a seed of length poly-logarithmic in the input, or only require the seed to be weakly random.
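Trevisan's paradigm composes a one-bit extractor with combinatorial designs and is not reproduced here. As a much simpler illustration of the seeded-extractor interface the abstract discusses, the sketch below implements a shift-based 2-universal hash (Leftover-Hash-Lemma style). Note the contrast: unlike Trevisan's construction, this toy needs a seed linear in the input length, not poly-logarithmic. Function names are ours.

```python
def hash_extract(source_bits, seed_bits, m):
    """Seeded extraction via a shift-based 2-universal hash: output bit i
    is the inner product mod 2 of the source with the length-n window
    seed_bits[i : i + n].  By the Leftover Hash Lemma the output length m
    can be taken up to roughly the min-entropy of the source (minus a
    security margin) while remaining close to uniform."""
    n = len(source_bits)
    assert len(seed_bits) == n + m - 1, "seed must have n + m - 1 bits"
    return [
        sum(s & r for s, r in zip(source_bits, seed_bits[i:i + n])) % 2
        for i in range(m)
    ]
```

Because each output bit is a linear function of the source over GF(2), extraction commutes with XOR of inputs under a fixed seed, which is what makes the family easy to analyze.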
In the (deletion-channel) trace reconstruction problem, there is an unknown n-bit source string x. An algorithm is given access to independent traces of x, where a trace is formed by deleting each bit of x independently with probability δ. The goal of the algorithm is to recover x exactly (with high probability), while minimizing samples (number of traces) and running time. Previously, the best known algorithm for the trace reconstruction problem was due to Holenstein et al. [SODA 2008]; it uses exp(O(n^{1/2})) samples and running time for any fixed 0 < δ < 1. It is also what we call a "mean-based algorithm", meaning that it only uses the empirical means of the individual bits of the traces. Holenstein et al. also gave a lower bound, showing that any mean-based algorithm must use at least n^{Ω(log n)} samples. In this paper we improve both of these results, obtaining matching upper and lower bounds for mean-based trace reconstruction. For any constant deletion rate 0 < δ < 1, we give a mean-based algorithm that uses exp(O(n^{1/3})) time and traces; we also prove that any mean-based algorithm must use at least exp(Ω(n^{1/3})) traces. In fact, we obtain matching upper and lower bounds even for δ subconstant and ρ := 1 − δ subconstant: when (log^3 n)/n ≤ δ ≤ 1/2 the bound is exp(Θ(δn)^{1/3}), and when 1/√n ≤ ρ ≤ 1/2 the bound is exp(Θ(n/ρ)^{1/3}). Our proofs involve estimates for the maxima of Littlewood polynomials on complex disks. We show that these techniques can also be used to perform trace reconstruction with random insertions and bit-flips in addition to deletions. We also find a surprising result: for deletion probabilities δ > 1/2, the presence of insertions can actually help with trace reconstruction.
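The sampling model and the statistics a "mean-based" algorithm is restricted to are easy to make concrete. The following minimal sketch (function names are ours, not from the paper) generates traces and tabulates the empirical per-position bit means — the only information a mean-based reconstruction algorithm may consult:

```python
import random

def trace(x, delta, rng=random):
    """One trace of bit-string x: delete each bit independently
    with probability delta; survivors keep their relative order."""
    return [b for b in x if rng.random() >= delta]

def empirical_bit_means(x, delta, num_traces, rng=random):
    """Empirical mean of each trace position j over num_traces traces,
    counting a trace shorter than j as contributing 0 at position j.
    A 'mean-based' algorithm sees only this vector (up to sampling noise)."""
    sums = [0.0] * len(x)
    for _ in range(num_traces):
        t = trace(x, delta, rng)
        for j, b in enumerate(t):
            sums[j] += b
    return [s / num_traces for s in sums]
```

The paper's bounds concern how many traces are needed before this mean vector pins down x; the analysis reduces to estimates on Littlewood polynomials.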
This is a paper about private data analysis, in which a trusted curator holding a confidential database responds to real, vector-valued queries. A common approach to ensuring privacy for the database elements is to add appropriately generated random noise to the answers, releasing only these noisy responses. A line of study initiated in [DN03] examines the amount of distortion needed to prevent privacy violations of various kinds. The results in the literature vary according to several parameters, including the size of the database, the size of the universe from which data elements are drawn, the "amount" of privacy desired, and, for the purposes of the current work, the arity of the query. In this paper we sharpen and unify these bounds. Our foremost result combines the techniques of Hardt and Talwar [HT10] and McGregor et al. [MMP+10] to obtain linear lower bounds on distortion when providing differential privacy for a (contrived) class of low-sensitivity queries. (A query has low sensitivity if the data of a single individual has small effect on the answer.) Several structural results follow as immediate corollaries:
• We separate so-called counting queries from arbitrary low-sensitivity queries, proving that the latter requires more noise, or distortion, than the former;
• We separate (ε, 0)-differential privacy from its well-studied relaxation (ε, δ)-differential privacy, even when δ ∈ 2^{−o(n)} is negligible in the size n of the database, proving that the latter requires less distortion than the former;
• We demonstrate that (ε, δ)-differential privacy is much weaker than (ε, 0)-differential privacy in terms of the mutual information of the mechanism's transcript with the database, even when δ ∈ 2^{−o(n)} is negligible in the size n of the database.
We also simplify the lower bounds on noise for counting queries in [HT10] and make them unconditional.
Further, we use a characterization of (ε, δ)-differential privacy from [MMP+10] to obtain lower bounds on the distortion needed to ensure (ε, δ)-differential privacy for ε, δ > 0. Next, we revisit the LP decoding argument of [DMT07] and combine it with recent results of Rudelson [Rud11] to show that for some specific η > 0, if the ℓ-way marginals are released such that at least a 1 − η fraction of the entries have o(√n) noise, then a very minimal notion of privacy called attribute privacy is violated. This improves on a recent result of Kasiviswanathan et al. [KRSU10], where the same conclusion was shown assuming that all the entries have o(√n) noise. Finally, we extend the original lower bound of [DN03] on the noise required to prevent blatant non-privacy to the case when the universe size is smaller than the size of the database. As we show, the lower bound on the noise required to prevent blatant non-privacy becomes larger as the size of the universe decreases.
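The noise-addition approach whose limits these lower bounds characterize has a standard positive counterpart: the Laplace mechanism. As a hedged sketch (the database, predicate, and names below are illustrative, not from the paper), a counting query has sensitivity 1, so Laplace noise of scale 1/ε achieves (ε, 0)-differential privacy:

```python
import math
import random

def laplace_noise(scale, rng=random):
    """Sample Laplace(0, scale) by the inverse-CDF transform."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_count(database, predicate, epsilon, rng=random):
    """Laplace mechanism for a counting query: one individual's data
    changes the count by at most 1 (sensitivity 1), so noise of scale
    1/epsilon yields (epsilon, 0)-differential privacy."""
    true_answer = sum(1 for row in database if predicate(row))
    return true_answer + laplace_noise(1.0 / epsilon, rng)
```

The lower bounds in this paper say how much such distortion is unavoidable: for counting queries noise o(√n) on most answers already breaches attribute privacy, and for the contrived low-sensitivity queries linear distortion is required.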
The Chow parameters of a Boolean function f : {−1, 1}^n → {−1, 1} are its n + 1 degree-0 and degree-1 Fourier coefficients. It has been known since 1961 [Chow 1961; Tannenbaum 1961] that the (exact values of the) Chow parameters of any linear threshold function f uniquely specify f within the space of all Boolean functions, but until recently [O'Donnell and Servedio 2011] nothing was known about efficient algorithms for reconstructing f (exactly or approximately) from exact or approximate values of its Chow parameters. We refer to this reconstruction problem as the Chow Parameters Problem. Our main result is a new algorithm for the Chow Parameters Problem which, given (sufficiently accurate approximations to) the Chow parameters of any linear threshold function f, runs in time Õ(n^2) · (1/ε)^{O(log^2(1/ε))} and with high probability outputs a representation of an LTF f′ that is ε-close to f in Hamming distance. The only previous algorithm [O'Donnell and Servedio 2011] had running time poly(n) · 2^{2^{Õ(1/ε^2)}}. As a byproduct of our approach, we show that for any linear threshold function f over {−1, 1}^n, there is a linear threshold function f′ which is ε-close to f and has all weights that are integers of magnitude at most √n · (1/ε)^{O(log^2(1/ε))}. This significantly improves the previous best result of Diakonikolas and Servedio [2009], which gave a poly(n) · 2^{Õ(1/ε^{2/3})} weight bound, and is close to the known lower bound of max{√n, (1/ε)^{Ω(log log(1/ε))}} [Goldberg 2006; Servedio 2007]. Our techniques also yield improved algorithms for related problems in learning theory. In addition to being significantly stronger than previous work, our results are obtained using conceptually simpler proofs.
The two main ingredients underlying our results are (1) a new structural result showing that, for any linear threshold function f and any bounded function g, if the Chow parameters of f are close to the Chow parameters of g then f is close to g; and (2) a new boosting-like algorithm that, given approximations to the Chow parameters of a linear threshold function f, outputs a bounded function whose Chow parameters are close to those of f.
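The objects the algorithm consumes are easy to compute exactly for small n. The sketch below (exhaustive enumeration, so exponential in n; names are ours) evaluates the n + 1 Chow parameters — E[f(x)] and E[f(x)·x_i] under the uniform distribution on {−1, 1}^n:

```python
from itertools import product

def chow_parameters(f, n):
    """Exact Chow parameters of f : {-1,1}^n -> {-1,1}: the degree-0
    Fourier coefficient E[f(x)] and the n degree-1 coefficients
    E[f(x) * x_i], averaged over all 2^n points (exhaustive, for small n)."""
    total0 = 0
    totals = [0] * n
    for x in product([-1, 1], repeat=n):
        fx = f(x)
        total0 += fx
        for i in range(n):
            totals[i] += fx * x[i]
    size = 2 ** n
    return total0 / size, [t / size for t in totals]

def majority(x):
    """A canonical linear threshold function (no ties for odd n)."""
    return 1 if sum(x) > 0 else -1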
This paper studies the problem of learning "low-complexity" probability distributions over the Boolean hypercube {−1, 1}^n. As in the standard PAC learning model, a learning problem in our framework is defined by a class C of Boolean functions over {−1, 1}^n, but in our model the learning algorithm is given uniform random satisfying assignments of an unknown f ∈ C and its goal is to output a high-accuracy approximation of the uniform distribution over f^{−1}(1). This distribution learning problem may be viewed as a demanding variant of standard Boolean function learning, where the learning algorithm only receives positive examples and, more importantly, must output a hypothesis function which has small multiplicative error (i.e., small error relative to the size of f^{−1}(1)). As our main results, we show that the two most widely studied classes of Boolean functions in computational learning theory, linear threshold functions and DNF formulas, have efficient distribution learning algorithms in our model. Our algorithm for linear threshold functions runs in time poly(n, 1/ε) and our algorithm for polynomial-size DNF runs in time quasipoly(n, 1/ε). We obtain both these results via a general approach that combines a broad range of technical ingredients, including the complexity-theoretic study of approximate counting and uniform generation; the Statistical Query model from learning theory; and hypothesis testing techniques from statistics. A key conceptual and technical ingredient of this approach is a new kind of algorithm which we devise called a "densifier", which we believe may be useful in other contexts. We also establish limitations on efficient learnability in our model by showing that the existence of certain types of cryptographic signature schemes implies that certain learning problems in our framework are computationally hard.
Via this connection we show that assuming the existence of sufficiently strong unique signature schemes, there are no sub-exponential time learning algorithms in our framework for intersections of two halfspaces, for degree-2 polynomial threshold functions, or for monotone 2-CNF formulas. Thus our positive results for distribution learning come close to the limits of what can be achieved by efficient algorithms.
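The "uniform generation" ingredient mentioned in the abstract has a classical instantiation for DNF: Karp–Luby sampling. As a hedged sketch (the term representation, use of True/False rather than ±1, and function names are ours), the following draws exactly uniform satisfying assignments of a small DNF without enumerating the hypercube:

```python
import random

def sample_dnf_satisfying(terms, n, rng=random):
    """Karp-Luby uniform generation for a DNF over n Boolean variables.
    terms: list of dicts {var_index: required_bool}, each a conjunction.
    Term T has 2^(n - |T|) satisfying assignments, so: pick a term with
    probability proportional to that weight, fill the free variables
    uniformly, then accept with probability 1 / (#terms satisfied).
    Accepted samples are exactly uniform over the union of the terms."""
    weights = [2 ** (n - len(t)) for t in terms]
    total = sum(weights)
    while True:
        r = rng.randrange(total)
        j = 0
        while r >= weights[j]:
            r -= weights[j]
            j += 1
        x = [rng.random() < 0.5 for _ in range(n)]
        for var, val in terms[j].items():
            x[var] = val
        covered = sum(1 for t in terms
                      if all(x[var] == val for var, val in t.items()))
        if rng.randrange(covered) == 0:  # accept w.p. 1/covered
            return x
```

The acceptance step corrects for assignments counted once per term they satisfy; the expected number of rejections is at most the number of terms, which is what makes uniform generation for DNF efficient.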