In a Private Information Retrieval (PIR) protocol, a user can download a file from a database without revealing the identity of the file to each individual server. A PIR protocol is called t-private if the identity of the file remains concealed even if t of the servers collude. Graph-based replication is a simple technique, prevalent in both theory and practice, for achieving erasure robustness in storage systems. In this technique, each file is replicated on two or more storage servers, giving rise to a (hyper-)graph structure. In this paper we study private information retrieval protocols in graph-based replication systems. The main interest of this work is maximizing the parameter t and, in particular, understanding the structure of the colluding sets which emerge in a given graph. Our main contribution is a 2-replication scheme which guarantees perfect privacy from acyclic sets in the graph, and guarantees partial privacy in the presence of cycles. Furthermore, by providing an upper bound, it is shown that the PIR rate of this scheme is at most a factor of two from its optimal value for an important family of graphs. Lastly, we extend our results to larger replication factors and to graph-based coding, which is a similar technique with smaller storage overhead and larger PIR rate.

Parts of this work were presented at the International Symposium on Information Theory (ISIT), Vail, CO, USA, 2018.

1 In some settings, only computational privacy is required, but this paper focuses exclusively on perfect privacy.
, which naturally corresponds to the sets

As another example, in which N − K ≥ K, we may consider the following.

Example 18. Assume that N − K = 6 and K = 4, which implies that r = 2 and b = 3. Consider the following matrix

\[
\begin{pmatrix}
1 & 1 & 1 & 1 & 2 & 2 \\
2 & 2 & 3 & 3 & 3 & 3
\end{pmatrix},
\]

which naturally corresponds to the sets

\[
\begin{aligned}
J(1,1) &= \{1, 2, 3, 4\}, & J(2,1) &= \emptyset, \\
J(1,2) &= \{5, 6\}, & J(2,2) &= \{1, 2\}, \\
J(1,3) &= \emptyset, & J(2,3) &= \{3, 4, 5, 6\}.
\end{aligned}
\]
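The correspondence between the matrix and the sets in Example 18 can be checked mechanically. The following is a minimal sketch, assuming that the twelve entries form an r × (N − K) = 2 × 6 array read row by row, and that J(i,j) collects the (1-based) column indices at which row i has the entry j. This reading is inferred from the listed sets rather than stated in the example, and the function name `index_sets` is hypothetical.

```python
def index_sets(matrix, b):
    """Map each pair (i, j) to the set of 1-based column indices
    at which row i of `matrix` holds the value j, for j = 1..b."""
    sets = {}
    for i, row in enumerate(matrix, start=1):
        for j in range(1, b + 1):
            sets[(i, j)] = {c for c, entry in enumerate(row, start=1) if entry == j}
    return sets


# Matrix of Example 18, laid out as r = 2 rows and N - K = 6 columns
# (the row/column layout is an assumption about the flattened entries).
M = [
    [1, 1, 1, 1, 2, 2],
    [2, 2, 3, 3, 3, 3],
]

J = index_sets(M, b=3)

# Print the sets in the order used in the example: J(1,1), J(2,1), J(1,2), ...
for j in range(1, 4):
    for i in range(1, 3):
        cols = sorted(J[(i, j)])
        print(f"J({i},{j}) = {cols if cols else 'empty'}")
```

Under this reading, every column index appears in exactly one set per row, so the sets J(i,1), ..., J(i,b) partition {1, ..., N − K} for each row i.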