We investigate the adversarial robustness of streaming algorithms. In this context, an algorithm is considered robust if its performance guarantees hold even if the stream is chosen adaptively by an adversary that observes the outputs of the algorithm along the stream and can react in an online manner. While deterministic streaming algorithms are inherently robust, many central problems in the streaming literature do not admit sublinear-space deterministic algorithms; on the other hand, classical space-efficient randomized algorithms for these problems are generally not adversarially robust. This raises the natural question of whether there exist efficient adversarially robust (randomized) streaming algorithms for these problems.
In this paper, we resolve the one-pass space complexity of perfect L_p sampling for p ∈ (0, 2) in a stream. Given a stream of updates (insertions and deletions) to the coordinates of an underlying vector f ∈ R^n, a perfect L_p sampler must output an index i with probability |f_i|^p / ||f||_p^p, and is allowed to fail with some probability δ. So far, for p > 0 no algorithm has been shown to solve the problem exactly using poly(log n) bits of space. In 2010, Monemizadeh and Woodruff introduced an approximate L_p sampler, which outputs i with probability (1 ± ν)|f_i|^p / ||f||_p^p, using space polynomial in ν^{-1} and log n. The space complexity was later reduced by Jowhari, Saglam, and Tardos to roughly O(ν^{-p} log^2 n log δ^{-1}) for p ∈ (0, 2), which matches the Ω(log^2 n log δ^{-1}) lower bound in terms of n and δ, but is loose in terms of ν. Given these nearly tight bounds, it is perhaps surprising that no lower bound exists in terms of ν; not even a bound of Ω(ν^{-1}) is known. In this paper, we explain this phenomenon by demonstrating the existence of an O(log^2 n log δ^{-1})-bit perfect L_p sampler for p ∈ (0, 2). This shows that ν need not factor into the space of an L_p sampler, which closes the complexity of the problem for this range of p. For p = 2, our bound is O(log^3 n log δ^{-1}) bits, which matches the prior best known upper bound of O(ν^{-2} log^3 n log δ^{-1}), but has no dependence on ν. For p < 2, our bound holds in the random oracle model, matching the lower bounds in that model. Moreover, we show that our algorithm can be derandomized with only an O((log log n)^2) blowup in the space (and no blowup for p = 2).
Our derandomization technique is quite general, and can be used to derandomize a large class of linear sketches, including the more accurate count-sketch variant of [MP14], resolving an open question in that paper. Finally, we show that a (1 ± ε) relative error estimate of the frequency f_i of the sampled index i can be obtained using an additional O(ε^{-p} log n) bits of space for p < 2, and O(ε^{-2} log^2 n) bits for p = 2, which was previously possible only by running the prior algorithms with ν = ε.
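As a point of reference for the target distribution, a naive offline L_p sampler (which stores the entire vector f, unlike the sublinear-space streaming algorithms discussed above) can be sketched as follows; the function name is illustrative, not from the paper.

```python
import numpy as np

def offline_lp_sample(f, p, rng=None):
    """Sample an index i with probability |f_i|^p / ||f||_p^p.

    Offline reference for the distribution a perfect L_p sampler must
    realize; the paper's streaming algorithm achieves this without
    storing f, in O(log^2 n log(1/delta)) bits for p in (0, 2).
    """
    rng = rng or np.random.default_rng()
    w = np.abs(f) ** p              # |f_i|^p
    return rng.choice(len(f), p=w / w.sum())
```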
Two prevalent models in the data stream literature are the insertion-only and turnstile models. Unfortunately, many important streaming problems require a Θ(log n) multiplicative factor more space for turnstile streams than for insertion-only streams. This complexity gap often arises because the underlying frequency vector f is very close to 0 after accounting for all insertions and deletions to items. Signal detection in such streams is difficult, given the large number of deletions. In this work, we propose an intermediate model which, given a parameter α ≥ 1, lower bounds the norm ||f||_p by a 1/α fraction of the L_p mass of the stream had all updates been positive. Here, for a vector f, ||f||_p = (Σ_{i=1}^n |f_i|^p)^{1/p}, and the value of p we choose depends on the application. This gives a fluid medium between insertion-only streams (with α = 1) and turnstile streams (with α = poly(n)), and allows for analysis in terms of α. We show that for streams with this α-property, for many fundamental streaming problems we can replace an O(log n) factor in the space usage of turnstile-model algorithms with an O(log α) factor. This is true for identifying heavy hitters, inner product estimation, L_0 estimation, L_1 estimation, L_1 sampling, and support sampling. For each problem, we give matching or nearly matching lower bounds for α-property streams. We note that in practice, many important turnstile data streams are in fact α-property streams for small values of α. For such applications, our results represent significant improvements in efficiency for all the aforementioned problems.
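The α-property above can be checked directly offline: compare the L_p norm of the net frequency vector against the L_p norm the stream would have if every update were an insertion. A minimal sketch, with illustrative names not taken from the paper:

```python
import numpy as np

def alpha_property(updates, n, p, alpha):
    """Check whether a stream of (index, delta) updates has the
    alpha-property: ||f||_p >= (1/alpha) * ||h||_p, where f is the net
    frequency vector and h_i is the mass item i would have had if
    every update were positive.
    """
    f = np.zeros(n)  # net frequencies
    h = np.zeros(n)  # insertion-only mass
    for i, delta in updates:
        f[i] += delta
        h[i] += abs(delta)
    lp = lambda v: (np.abs(v) ** p).sum() ** (1.0 / p)
    return lp(f) >= lp(h) / alpha
```

An insertion-only stream satisfies the property with α = 1, while heavy cancellation forces a larger α.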
We study two simple yet general complexity classes, which provide a unifying framework for efficient query evaluation in areas like graph databases and information extraction, among others. We investigate the complexity of three fundamental algorithmic problems for these classes: enumeration, counting, and uniform generation of solutions, and show that they have several desirable properties in this respect. Both complexity classes are defined in terms of nondeterministic logarithmic-space transducers (NL transducers). For the first class, we consider the case of unambiguous NL transducers, and we prove constant-delay enumeration, and both counting and uniform generation of solutions in polynomial time. For the second class, we consider unrestricted NL transducers, and we obtain polynomial-delay enumeration, approximate counting in polynomial time, and polynomial-time randomized algorithms for uniform generation. More specifically, we show that each problem in this second class admits a fully polynomial-time randomized approximation scheme (FPRAS) and a polynomial-time Las Vegas algorithm (with preprocessing) for uniform generation. Remarkably, the key idea to prove these results is to show that the fundamental problem #NFA admits an FPRAS, where #NFA is the problem of counting the number of strings of length n (given in unary) accepted by a nondeterministic finite automaton (NFA). While this problem is known to be #P-complete and, more precisely, SpanL-complete, it was open whether it admits an FPRAS. In this work, we solve this open problem, and obtain as a welcome corollary that every function in SpanL admits an FPRAS.
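The #NFA problem itself is easy to state: a brute-force exact counter (exponential in n, in contrast to the polynomial-time FPRAS the abstract establishes) might look like the sketch below, under an assumed encoding of the transition function as a dictionary.

```python
from itertools import product

def count_accepted(delta, start, accept, alphabet, n):
    """Count strings of length n accepted by an NFA, by enumeration.

    delta: dict mapping (state, symbol) -> set of successor states.
    This takes |alphabet|^n time; #NFA asks for this count, and the
    abstract shows it can be (1 +/- eps)-approximated in poly time.
    """
    count = 0
    for word in product(alphabet, repeat=n):
        states = {start}
        for sym in word:
            states = set().union(*(delta.get((q, sym), set()) for q in states))
        if states & accept:
            count += 1
    return count
```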
We consider the following fundamental problem in the study of neural networks: given input examples x ∈ R^d and their vector-valued labels, as defined by an underlying generative neural network, recover the weight matrices of this network. We consider two-layer networks, mapping R^d to R^m, with a single hidden layer and k non-linear activation units f(·), where f(x) = max{x, 0} is the ReLU activation function. Such a network is specified by two weight matrices, U* ∈ R^{m×k} and V* ∈ R^{k×d}, such that the label of an example x ∈ R^d is given by U* f(V* x), where f(·) is applied coordinate-wise. Given n samples x_1, …, x_n ∈ R^d as a matrix X ∈ R^{d×n} and the label U* f(V* X) of the network on these samples, our goal is to recover the weight matrices U* and V*. More generally, our labels U* f(V* X) may be corrupted by noise, and instead we observe U* f(V* X) + E, where E is some noise matrix. Even in this case, we may still be interested in recovering good approximations to the weight matrices U* and V*. In this work, we develop algorithms and hardness results under varying assumptions on the input and noise. Although the problem is NP-hard even for k = 2, by assuming Gaussian marginals over the input X we are able to develop polynomial time algorithms for the approximate recovery of U* and V*. Perhaps surprisingly, in the noiseless case our algorithms recover U* and V* exactly, i.e., with no error. To the best of our knowledge, this is the first algorithm to accomplish exact recovery for the ReLU activation function. For the noisy case, we give the first polynomial time algorithm that approximately recovers the weights in the presence of mean-zero noise E. Our algorithms generalize to a larger class of rectified activation functions, with f(x) = 0 when x ≤ 0 and f(x) > 0 otherwise.
Although our polynomial time results require U* to have full column rank, we also give a fixed-parameter tractable algorithm (in k) when U* does not have this property. Lastly, we give a fixed-parameter tractable algorithm for more general noise matrices E, so long as they are independent of X.
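The generative model in this abstract is simple to write down; a minimal sketch of the label map U* f(V* X) (optionally plus noise E), with illustrative naming:

```python
import numpy as np

def network_labels(U, V, X, E=None):
    """Labels of the two-layer ReLU network from the abstract:
    U f(V X) (+ E), with f(x) = max{x, 0} applied entrywise.
    Shapes: U is m x k, V is k x d, X is d x n, E is m x n.
    """
    relu = lambda z: np.maximum(z, 0.0)
    Y = U @ relu(V @ X)
    return Y if E is None else Y + E
```

The recovery problem studied in the paper is the inverse: given X and (possibly noisy) Y, find U and V.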