A 2-server Private Information Retrieval (PIR) scheme allows a user to retrieve the i-th bit of an n-bit database replicated among two non-communicating servers, while not revealing any information about i to either server. In this work we construct a 2-server PIR scheme with total communication cost n^{O(√(log log n / log n))}. This improves over current 2-server protocols, which all require Ω(n^{1/3}) communication. Our construction circumvents the n^{1/3} barrier of Razborov and Yekhanin [21], which holds for the restricted model of bilinear group-based schemes (covering all previous 2-server schemes). The improvement comes from reducing the number of servers in existing protocols, based on Matching Vector Codes, from 3 or 4 servers to 2. This is achieved by viewing these protocols in an algebraic way (using polynomial interpolation) and extending them using partial derivatives.
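For readers unfamiliar with the model, the following Python sketch shows the classical folklore 2-server PIR scheme with linear (O(n)-bit) communication: the user sends a random subset of indices to one server and the same subset with index i toggled to the other. It only illustrates the privacy model; it is not the sub-polynomial-communication scheme constructed in this work.

import secrets

def make_queries(n, i):
    # Query to server 1: a uniformly random subset of {0,...,n-1}, as a bit vector.
    # Query to server 2: the same subset with index i toggled.
    # Each query on its own is uniformly random, so neither server learns i.
    q1 = [secrets.randbelow(2) for _ in range(n)]
    q2 = q1.copy()
    q2[i] ^= 1
    return q1, q2

def server_answer(db, q):
    # Each server XORs together the database bits selected by its query.
    ans = 0
    for bit, sel in zip(db, q):
        if sel:
            ans ^= bit
    return ans

def reconstruct(a1, a2):
    # The two answers differ exactly in the contribution of bit i.
    return a1 ^ a2

# Example usage: retrieve db[5] without revealing the index to either server.
db = [1, 0, 1, 1, 0, 1, 0, 0]
q1, q2 = make_queries(len(db), 5)
assert reconstruct(server_answer(db, q1), server_answer(db, q2)) == db[5]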
give a construction with O(n^3) field size, whereas previous constructions needed n^{Θ(a)} field size. Our construction for h = 2 makes r = 3, a = 1, h = 3 the next smallest setting to investigate regarding the existence of MR LRCs over fields of near-linear size. We answer this question in the positive via a novel approach based on elliptic curves and arithmetic-progression-free sets.
¹ The term local reconstruction codes is from [HSX+12]. Essentially the same codes were called locally repairable codes in [PD14] and locally recoverable codes in [TB14]. Thankfully, all of the names above abbreviate to LRCs.
Motivated by applications in recommender systems, web search, social choice, and crowdsourcing, we consider the problem of identifying the set of top K items from noisy pairwise comparisons. In our setting, we are non-actively given r pairwise comparisons between each pair of n items, where each comparison has noise constrained by a very general noise model called the strong stochastic transitivity (SST) model. We analyze the competitive ratio of algorithms for the top-K problem. In particular, we present a linear-time algorithm for the top-K problem which has a competitive ratio of Õ(√n); i.e., to solve any instance of top-K, our algorithm needs at most Õ(√n) times as many samples as the best possible algorithm for that instance (in contrast, all previously known algorithms for the top-K problem have competitive ratios of Ω(n) or worse). We further show that this is tight: any algorithm for the top-K problem has competitive ratio at least Ω(√n). * The full version of this paper can be found at https://arxiv.org/abs/1605.03933.
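As a point of reference for the problem setup (and not the competitive-ratio-optimal algorithm analyzed in the abstract above), a simple baseline is to score each item by its empirical win rate over all observed comparisons and output the K highest-scoring items. A minimal Python sketch, with made-up item strengths for illustration:

import numpy as np

def top_k_by_win_count(comparisons, n, K):
    # comparisons: list of (i, j, winner) tuples, with winner equal to i or j.
    # Returns the K items with the highest empirical win rates.
    # This is only a counting baseline for illustration.
    wins = np.zeros(n)
    games = np.zeros(n)
    for i, j, winner in comparisons:
        games[i] += 1
        games[j] += 1
        wins[winner] += 1
    rates = wins / np.maximum(games, 1)   # avoid division by zero
    return set(np.argsort(-rates)[:K])

# Hypothetical usage: 4 items, items 0 and 1 are the true top-2.
rng = np.random.default_rng(0)
strength = [0.9, 0.8, 0.3, 0.2]          # assumed latent item qualities
comps = []
for i in range(4):
    for j in range(i + 1, 4):
        for _ in range(20):               # r = 20 comparisons per pair
            p = strength[i] / (strength[i] + strength[j])
            comps.append((i, j, i if rng.random() < p else j))
print(top_k_by_win_count(comps, 4, 2))    # typically {0, 1}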
We give simpler, sparser, and faster algorithms for differentially private fine-tuning of large-scale pre-trained language models, which achieve the state-of-the-art privacy-versus-utility tradeoffs on many standard NLP tasks. We propose a meta-framework for this problem, inspired by the recent success of highly parameter-efficient methods for fine-tuning. Our experiments show that differentially private adaptations of these approaches outperform previous private algorithms in three important dimensions: utility, privacy, and the computational and memory cost of private training. On many commonly studied datasets, the utility of private models approaches that of non-private models. For example, on the MNLI dataset we achieve an accuracy of 87.8% using RoBERTa-Large and 83.5% using RoBERTa-Base with a privacy budget of ε = 6.7. In comparison, absent privacy constraints, RoBERTa-Large achieves an accuracy of 90.2%. Our findings are similar for natural language generation tasks. Privately fine-tuning GPT-2-Small, GPT-2-Medium, GPT-2-Large, and GPT-2-XL on DART achieves BLEU scores of 38.5, 42.0, 43.1, and 43.8, respectively (privacy budget of ε = 6.8, δ = 1e-5), whereas the non-private baseline is 48.1. All our experiments suggest that larger models are better suited for private fine-tuning: while they are well known to achieve superior accuracy non-privately, we find that they also better maintain their accuracy when privacy is introduced.
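The per-example gradient clipping and noise addition that underlie private training can be illustrated with a minimal NumPy sketch of one generic DP-SGD step. This is a generic illustration of the mechanism, not the parameter-efficient framework proposed in the abstract above; the clipping norm C and noise multiplier sigma are assumed hyperparameters.

import numpy as np

def dp_sgd_step(params, per_example_grads, lr=0.1, C=1.0, sigma=1.0, rng=None):
    # One differentially private SGD step on a batch of per-example gradients.
    # per_example_grads has shape (batch_size, num_params). Each gradient is
    # clipped to l2 norm at most C, the clipped gradients are averaged, and
    # Gaussian noise of scale sigma * C / batch_size is added to the average.
    rng = rng or np.random.default_rng()
    batch_size = per_example_grads.shape[0]
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, C / np.maximum(norms, 1e-12))
    noisy_mean = clipped.mean(axis=0) + rng.normal(
        scale=sigma * C / batch_size, size=params.shape)
    return params - lr * noisy_mean

# Toy usage with hypothetical gradients for a 3-parameter model.
params = np.zeros(3)
grads = np.array([[0.5, -2.0, 1.0], [3.0, 0.1, -0.4]])
params = dp_sgd_step(params, grads, C=1.0, sigma=0.8)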
Motivated by a problem on random differences in Szemerédi's theorem and another problem on large deviations for arithmetic progressions in random sets, we prove upper bounds on the Gaussian width of special point sets in R^k. The point sets are formed by the image of the n-dimensional Boolean hypercube under a mapping ψ : R^n → R^k, where each coordinate is a constant-degree multilinear polynomial with 0-1 coefficients. We show the following applications of our bounds. Let [Z/NZ]_p be the random subset of Z/NZ containing each element independently with probability p. This gives a polynomial improvement for all ℓ ≥ 3 of a previous bound due to Frantzikinakis, Lesigne and Wierdl, and reproves more directly the same improvement shown recently by the authors and Dvir (here we avoid the theories of locally decodable codes and quantum information). • Let X_k be the number of k-term arithmetic progressions in [Z/NZ]_p and consider the large deviation rate ρ_k(δ) = log Pr[X_k ≥ (1 + δ)E X_k]. We give quadratic improvements of the best-known range of p for which a highly precise estimate of ρ_k(δ) due to Bhattacharya, Ganguly, Shao and Zhao is valid for all odd k ≥ 5. In particular, the estimate holds if p ≥ ω(N^{-c_k} log N) for c_k = (6k⌈(k − 1)/2⌉)^{-1}. We also discuss connections with locally decodable codes and the Banach-space notion of type for injective tensor products of ℓ_p-spaces.
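For context, the Gaussian width referred to above is the standard quantity (stated here for completeness; the notation is the common one rather than necessarily the paper's exact normalization):

w(S) = \mathbb{E}_{g \sim N(0, I_k)} \Big[ \sup_{x \in S} \langle g, x \rangle \Big], \qquad S \subseteq \mathbb{R}^k,

where here S = ψ({0,1}^n) is the image of the Boolean hypercube under the polynomial map ψ.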
In this paper, we study private optimization problems for non-smooth convex functions F(x) = E_i f_i(x) on R^d. We show that modifying the exponential mechanism by adding an ℓ_2^2 regularizer to F(x) and sampling from π(x) ∝ exp(−k(F(x) + µ‖x‖_2^2/2)) recovers both the known optimal empirical risk and population loss under (ε, δ)-DP. Furthermore, we show how to implement this mechanism using O(n min(d, n)) queries to f_i(x) for DP-SCO, where n is the number of samples/users and d is the ambient dimension. We also give a (nearly) matching lower bound Ω(n min(d, n)) on the number of evaluation queries. Our results utilize several tools that are of independent interest.
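To make the regularized exponential mechanism concrete, here is a minimal one-dimensional Python sketch that samples from π(x) ∝ exp(−k(F(x) + µx²/2)) by discretizing the domain. The specific F, k, and µ below are placeholders, and the crude grid sampler is only for illustration; an actual implementation would use the sampling machinery discussed in the paper.

import numpy as np

def sample_regularized_exp_mechanism(F, k, mu, grid, rng=None):
    # Sample x with probability proportional to exp(-k * (F(x) + mu * x**2 / 2)).
    # F: callable, the (non-smooth) convex objective evaluated pointwise.
    # grid: 1-D array of candidate points (a crude discretization of the domain).
    rng = rng or np.random.default_rng()
    energies = np.array([k * (F(x) + mu * x * x / 2) for x in grid])
    energies -= energies.min()            # stabilize the exponentials
    weights = np.exp(-energies)
    probs = weights / weights.sum()
    return rng.choice(grid, p=probs)

# Toy usage: F is an average of absolute-loss terms (non-smooth, convex).
data = np.array([0.2, 0.5, 0.9])
F = lambda x: np.mean(np.abs(x - data))
grid = np.linspace(-2, 2, 4001)
x_priv = sample_regularized_exp_mechanism(F, k=50.0, mu=0.1, grid=grid)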