Abstract. We present a general framework for private information retrieval (PIR) from arbitrary coded databases that allows one to adjust the rate of the scheme to the suspected number of colluding servers. If the storage code is a generalized Reed-Solomon code of length $n$ and dimension $k$, we design PIR schemes that achieve a PIR rate of $\frac{n-(k+t-1)}{n}$ while protecting against any $t$ colluding servers, for any $1 \le t \le n-k$. This interpolates between the previously studied cases of $t = 1$ and $k = 1$ and achieves PIR capacity in both of these cases asymptotically as the number of files in the database grows.
Key words. private information retrieval, distributed storage systems, generalized Reed-Solomon codes
AMS subject classifications. 68P20, 68P30, 94B27, 14G50
DOI. 10.1137/16M1102562
1. Introduction. Private information retrieval (PIR) addresses the question of how to retrieve data items from a database without disclosing information about the identity of the data items retrieved, and was introduced by Chor et al. in [4,5]. The classic PIR model of [5] views the database as an $m$-bit binary string $x = [x_1 \cdots x_m] \in \{0,1\}^m$ and assumes that the user wants to retrieve a single bit $x_i$ without revealing any information about the index $i$. We consider a natural extension of this model, wherein the database is a string $x = [x_1 \cdots x_m]$ of files $x_i$, which are themselves bit strings, and the user wants to download one of the files $x_i$ without revealing its index.
The rate of a PIR scheme in this model is measured as the ratio of the gained information over the downloaded information, while upload costs of the requests are usually ignored. The trivial solution is to download the entire database. This, however, incurs a significant communication overhead whenever the database is large and is therefore not useful in practice.
While the trivial solution is the only way to guarantee information-theoretic privacy in the case of a single server [5], this problem can be remedied by replicating the database onto $n$ servers that do not communicate.
The study of PIR recently received renewed attention when Shah et al. introduced a model of coded private information retrieval (cPIR) [10,11]. Here, all files are distributed over the
The problem of Private Information Retrieval (PIR) from coded storage systems with colluding, byzantine, and unresponsive servers is considered. An explicit scheme using an $[n, k]$ Reed-Solomon storage code is designed, protecting against $t$-collusion and handling up to $b$ byzantine and $r$ unresponsive servers, when $n > k + t + 2b + r - 1$. This scheme achieves a PIR rate of $\frac{n-r-(k+2b+t-1)}{n-r}$. In the case where the capacity is known, namely when $k = 1$, it is asymptotically capacity-achieving as the number of files grows. Lastly, the scheme is adapted to symmetric PIR.
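The rate expression above is easy to evaluate for concrete parameters. The following is a minimal sketch, not code from any of the cited papers: the function name `pir_rate` and its argument names are hypothetical, chosen only to mirror the symbols n, k, t, b, and r in the abstract. Setting b = r = 0 reduces the expression to (n − (k + t − 1))/n, the rate stated for the generalized Reed-Solomon scheme with t colluding servers.

```python
from fractions import Fraction


def pir_rate(n: int, k: int, t: int, b: int = 0, r: int = 0) -> Fraction:
    """Evaluate (n - r - (k + 2b + t - 1)) / (n - r) as an exact fraction.

    Hypothetical helper for illustration only:
      n = number of servers (code length), k = code dimension,
      t = colluding servers, b = byzantine servers, r = unresponsive servers.
    """
    if n < 1 or k < 1 or t < 1 or b < 0 or r < 0:
        raise ValueError("need n, k, t >= 1 and b, r >= 0")
    # Condition stated in the abstract for the scheme to exist.
    if not n > k + t + 2 * b + r - 1:
        raise ValueError("need n > k + t + 2b + r - 1")
    return Fraction(n - r - (k + 2 * b + t - 1), n - r)


# No byzantine or unresponsive servers: rate is (n - (k + t - 1)) / n.
print(pir_rate(n=10, k=3, t=2))            # 3/5
# One byzantine and one unresponsive server lower the rate.
print(pir_rate(n=10, k=3, t=2, b=1, r=1))  # 1/3
```

Using `Fraction` keeps the result exact, which makes it easier to compare rates across parameter choices than with floating-point values.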
In Private Information Retrieval (PIR), one wants to download a file from a database without revealing to the database which file is being downloaded. Much attention has been paid to the case of the database being encoded across several servers, subsets of which can collude to attempt to deduce the requested file. With the goal of studying the achievable PIR rates in realistic scenarios, we generalize results for coded data from the case of all subsets of servers of size t colluding, to arbitrary subsets of the servers. We investigate the effectiveness of previous strategies in this new scenario, and present new results in the case where the servers are partitioned into disjoint colluding groups.
This paper presents private information retrieval (PIR) schemes for coded storage with colluding servers, which are not restricted to maximum distance separable (MDS) codes. PIR schemes for general linear codes are constructed and the resulting PIR rate is calculated explicitly. It is shown that codes with transitive automorphism groups yield the highest possible rates obtainable with the proposed scheme. This rate coincides with the known asymptotic PIR capacity for MDS-coded storage systems without collusion. While many PIR schemes in the literature require field sizes that grow with the number of servers and files in the system, we focus especially on the case of a binary base field, for which Reed-Muller codes serve as an important and explicit class of examples.