The problem of statistical inference in its various forms has been the subject of extensive research over several decades. Most of this effort has focused on characterizing performance as a function of the number of available samples, with far less attention given to the effect of memory limitations. Recently, the latter topic has drawn much interest in the engineering and computer science literature. In this survey paper, we review the state of the art in statistical inference under memory constraints for several canonical problems, including hypothesis testing, parameter estimation, and distribution property testing/estimation. We discuss the main results in this developing field and, by identifying recurrent themes, extract some fundamental building blocks for algorithmic constructions, as well as useful techniques for lower bound derivations.
In this paper we consider the problem of binary hypothesis testing with finite-memory systems. Let X1, X2, . . . be a sequence of independent identically distributed Bernoulli random variables, with expectation p under H0 and q under H1. Consider a finite-memory deterministic machine with S states that updates its state Mn ∈ {1, 2, . . . , S} at each time according to the rule Mn = f(Mn−1, Xn), where f is a deterministic time-invariant function. Assume that we let the process run for a very long time (n → ∞), and then make our decision according to some mapping from the state space to the hypothesis space. The main contribution of this paper is a lower bound on the Bayes error probability Pe of any such machine. In particular, our findings show that the ratio between the maximal exponential decay rate of Pe with S for a deterministic machine and that of a randomized one can become unbounded, complementing a result by Hellman.
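As a concrete illustration of the setup above, the sketch below simulates a simple saturating-counter state-update rule (a hypothetical choice of f, not the construction analyzed in the paper) and estimates its empirical error probabilities under both hypotheses, deciding H1 whenever the final state lands in the upper half of the state space.

```python
import random

def run_machine(S, p, n, rng):
    """One run of a saturating-counter machine: the state moves up on
    X = 1 and down on X = 0, clamped to {1, ..., S}."""
    m = (S + 1) // 2  # start near the middle of the state space
    for _ in range(n):
        x = 1 if rng.random() < p else 0
        m = min(S, m + 1) if x else max(1, m - 1)
    return m

def error_rates(S, p, q, n=200, trials=500, seed=0):
    """Empirical error probabilities of the decision rule that declares
    H1 iff the final state exceeds S // 2."""
    rng = random.Random(seed)
    err0 = sum(run_machine(S, p, n, rng) > S // 2
               for _ in range(trials)) / trials   # false alarm under H0
    err1 = sum(run_machine(S, q, n, rng) <= S // 2
               for _ in range(trials)) / trials   # miss under H1
    return err0, err1
```

For well-separated hypotheses (e.g. p = 0.2 versus q = 0.8) even this naive 8-state rule attains small empirical error, since the state drifts toward the corresponding end of the chain; how fast the best achievable error decays with S is exactly what the paper's bounds quantify.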
In this paper we consider the problem of estimating a Bernoulli parameter using finite memory. Let X1, X2, . . . be a sequence of independent identically distributed Bernoulli random variables with expectation θ, where θ ∈ [0, 1]. Consider a finite-memory deterministic machine with S states that updates its state Mn ∈ {1, 2, . . . , S} at each time according to the rule Mn = f(Mn−1, Xn), where f is a deterministic time-invariant function. Assume that the machine outputs an estimate at each time point according to some fixed mapping from the state space to the unit interval. The quality of the estimation procedure is measured by the asymptotic risk, which is the long-term average of the instantaneous quadratic risk. The main contribution of this paper is an upper bound on the smallest worst-case asymptotic risk any such machine can attain. This bound coincides with a lower bound derived by Leighton and Rivest, implying that Θ(1/S) is the minimax asymptotic risk for deterministic S-state machines. In particular, our result disproves a longstanding Θ(log S/S) conjecture for this quantity, also posed by Leighton and Rivest.
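To make the notion of asymptotic risk concrete, the sketch below computes it exactly for one simple (hypothetical, far from optimal) machine: a saturating counter with states k = 0, . . . , S−1 that moves up on X = 1 and down on X = 0, with state k outputting the estimate k/(S−1). This chain is a birth-death chain, so its stationary law is pi_k ∝ r^k with r = θ/(1−θ), and the asymptotic risk is the stationary expectation of the squared error.

```python
def asymptotic_risk(S, theta):
    """Exact asymptotic quadratic risk of a saturating-counter machine
    with states k = 0, ..., S-1 and estimate k/(S-1) at state k.
    Detailed balance gives the stationary law pi_k proportional to r^k,
    where r = theta / (1 - theta)."""
    r = theta / (1 - theta)
    weights = [r ** k for k in range(S)]
    Z = sum(weights)
    return sum((w / Z) * (k / (S - 1) - theta) ** 2
               for k, w in enumerate(weights))
```

Note that for θ = 1/2 the stationary law is uniform over the S states, so this machine's risk stays Θ(1) no matter how large S is; this illustrates why attaining the Θ(1/S) minimax rate requires more careful constructions than a plain counter.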
We consider the problem of distributed source simulation with no communication, in which Alice and Bob observe sequences U^n and V^n respectively, drawn from a joint distribution p_UV^⊗n, and wish to locally generate sequences X^n and Y^n respectively with a joint distribution that is close (in KL divergence) to p_XY^⊗n. We provide a single-letter condition under which such a simulation is asymptotically possible with vanishing KL divergence. Our condition is nontrivial only in the case where the Gács-Körner (GK) common information between U and V is nonzero, and we conjecture that otherwise only scalar Markov chains X − U − V − Y can be simulated. Motivated by this conjecture, we further examine the case where both p_UV and p_XY are doubly symmetric binary sources with parameters p and q ≤ 1/2, respectively. While it is trivial that in this case p ≤ q is both necessary and sufficient, we show that when p is close to q, any successful simulation is close to being scalar in the total variation sense.
Figure 2: Digital solution. This digital approach is viable only when C_GK(U; V) > 0. There is an even simpler analog approach that does not use common information: Alice and Bob pass their corresponding sequences through memoryless channels p_{X^n|U^n} = p_{X|U}^⊗n and p_{Y^n|V^n} = p_{Y|V}^⊗n, respectively, symbol by symbol.
Proposition 2 (analog solution). If X − U − V − Y form a Markov chain, then p_XY is simulable from p_UV.
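The analog approach can be sanity-checked numerically for the doubly symmetric binary case. In the sketch below (an illustrative setup with hypothetical channel crossovers a and b, not taken from the paper), a DSBS pair with parameter p is passed through symbol-wise binary symmetric channels, so that X − U − V − Y is a Markov chain and (X, Y) is itself a DSBS whose parameter is the binary convolution of a, p, and b.

```python
import random

def conv(*eps):
    """Binary convolution: overall crossover of a cascade of BSCs."""
    q = 0.0
    for e in eps:
        q = q * (1 - e) + (1 - q) * e
    return q

def analog_simulate(p, a, b, n, seed=0):
    """Symbol-by-symbol analog simulation: U ~ Bern(1/2), V = U xor Bern(p);
    Alice outputs X = U xor Bern(a), Bob outputs Y = V xor Bern(b).
    Returns the empirical disagreement rate of (X, Y), which should be
    close to conv(a, p, b)."""
    rng = random.Random(seed)
    disagree = 0
    for _ in range(n):
        u = rng.randint(0, 1)
        v = u ^ (rng.random() < p)
        x = u ^ (rng.random() < a)
        y = v ^ (rng.random() < b)
        disagree += x ^ y
    return disagree / n
```

Since conv(a, p, b) ≥ p for any a and b, this construction reaches exactly the DSBS targets with q ≥ p, matching the necessity of p ≤ q stated above.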