Salim El Rouayheb scite author profile

Federated learning (FL) is a machine learning setting where many clients (e.g. mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g. service provider), while keeping the training data decentralized. FL embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs resulting from traditional, centralized machine learning and data science approaches. Motivated by the explosive growth in FL research, this paper discusses recent advances and presents an extensive collection of open problems and challenges.

Fractional repetition codes for repair in distributed storage systems

Rouayheb

Ramchandran

2010

184

360

We introduce a new class of exact Minimum-Bandwidth Regenerating (MBR) codes for distributed storage systems, characterized by a low-complexity uncoded repair process that can tolerate multiple node failures. These codes consist of the concatenation of two components: an outer MDS code followed by an inner repetition code. We refer to the inner code as a Fractional Repetition code since it consists of splitting the data of each node into several packets and storing multiple replicas of each on different nodes in the system. Our model for repair is table-based and thus differs from the random access model adopted in the literature. We present constructions of Fractional Repetition codes based on regular graphs and Steiner systems for a large set of system parameters. The resulting codes are guaranteed to achieve the storage capacity for random access repair. The considered model motivates a new definition of capacity for distributed storage systems, which we call Fractional Repetition capacity. We provide upper bounds on this capacity, while a precise expression remains an open problem.
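The inner repetition layer described above can be illustrated with a small sketch. It assumes the simplest regular graph, the complete graph on n nodes, where each edge is one packet stored on both of its endpoints. The outer MDS code is omitted, and the function names `fr_code_complete_graph` and `repair_table` are hypothetical, chosen only for this illustration, not taken from the paper.

```python
from itertools import combinations


def fr_code_complete_graph(n):
    """Assign one packet per edge of K_n to both endpoint nodes.

    Assumption for illustration: packets stand in for symbols already
    produced by an outer MDS code, which is not modeled here.
    """
    storage = {node: set() for node in range(n)}
    for packet_id, (u, v) in enumerate(combinations(range(n), 2)):
        storage[u].add(packet_id)
        storage[v].add(packet_id)
    return storage


def repair_table(storage, failed):
    """Map each packet lost with `failed` to a surviving node holding a replica."""
    table = {}
    for packet in sorted(storage[failed]):
        helpers = [node for node, held in storage.items()
                   if node != failed and packet in held]
        table[packet] = helpers[0]  # uncoded repair: copy the replica as-is
    return table


storage = fr_code_complete_graph(5)
table = repair_table(storage, failed=0)
```

In this toy instance every node stores n − 1 = 4 packets, every packet has exactly two replicas, and a failed node is rebuilt by copying one packet from each of its four neighbors without any decoding.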

On the Index Coding Problem and Its Relation to Network Coding and Matroid Theory

Rouayheb

Sprintson

Georghiades

2010

IEEE Trans. Inform. Theory

279

306

The index coding problem has recently attracted significant attention from the research community due to its theoretical significance and applications in wireless ad-hoc networks. An instance of the index coding problem includes a sender that holds a set of information messages X = {x1, ..., xk} and a set of receivers R. Each receiver ρ = (x, H) ∈ R needs to obtain a message x ∈ X and has prior side information comprising a subset H of X. The sender uses a noiseless communication channel to broadcast encodings of messages in X to all receivers. The objective is to find an encoding scheme that minimizes the number of transmissions required to satisfy the receivers' demands with zero error. In this paper, we analyze the relation between the index coding problem, the more general network coding problem and the problem of finding a linear representation of a matroid. In particular, we show that any instance of the network coding and matroid representation problems can be efficiently reduced to an instance of the index coding problem. Our reduction implies that many important properties of the network coding and matroid representation problems carry over to the index coding problem. Specifically, we show that vector linear codes outperform scalar linear codes and that vector linear codes are insufficient for achieving the optimum number of transmissions.
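A toy instance may help make the problem statement concrete. The sketch below assumes two messages and two receivers, each wanting one message and holding the other as side information; the message values and the helper names `xor_bytes` and `decode` are hypothetical. A single XOR broadcast (a scalar linear code over GF(2)) satisfies both receivers, where sending the messages uncoded would take two transmissions.

```python
def xor_bytes(a, b):
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(p ^ q for p, q in zip(a, b))


# Assumed toy instance: X = {x1, x2}; each receiver is (wanted, side information H).
messages = {"x1": b"\x0f\xa0", "x2": b"\x33\x11"}
receivers = [("x1", {"x2"}), ("x2", {"x1"})]

# One coded transmission serves both receivers.
broadcast = xor_bytes(messages["x1"], messages["x2"])


def decode(side_info, broadcast, messages):
    """Recover the wanted message by cancelling the known one from the broadcast."""
    known = messages[next(iter(side_info))]
    return xor_bytes(broadcast, known)


decoded = [decode(h, broadcast, messages) for _, h in receivers]
```

Each receiver recovers its demand with zero error because XORing the broadcast with the message it already holds cancels that message out.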

Securing Dynamic Distributed Storage Systems Against Eavesdropping and Adversarial Attacks

Pawar

Rouayheb

Ramchandran

2011

IEEE Trans. Inform. Theory

173

297

We address the problem of securing distributed storage systems against eavesdropping and adversarial attacks. An important aspect of these systems is that nodes fail over time, which necessitates a repair mechanism in order to maintain a desired high system reliability. In such dynamic settings, an important security problem is to safeguard the system from an intruder who may come at different time instances during the lifetime of the storage system to observe and possibly alter the data stored on some nodes. In this scenario, we give upper bounds on the maximum amount of information that can be stored safely on the system. For an important operating regime of the distributed storage system, which we call the bandwidth-limited regime, we show that our upper bounds are tight and provide explicit code constructions. Moreover, we provide a way to shortlist the malicious nodes and expurgate the system.

Private Information Retrieval From MDS Coded Data in Distributed Storage Systems

Tajeddine

Gnilke

Rouayheb

2018

IEEE Trans. Inform. Theory

217

243

The problem of providing privacy, in the private information retrieval (PIR) sense, to users requesting data from a distributed storage system (DSS), is considered. The DSS is coded by an (n, k, d) Maximum Distance Separable (MDS) code to store the data reliably on unreliable storage nodes. Some of these nodes can be spies which report to a third party, such as an oppressive regime, which data is being requested by the user. An information theoretic PIR scheme ensures that a user can satisfy its request while revealing no information on which data is being requested to the nodes. A user can trivially achieve PIR by downloading all the data in the DSS. However, this is not a feasible solution due to its high communication cost. We construct PIR schemes with low download communication cost. When there is b = 1 spy node in the DSS, in other words, no collusion between the nodes, we construct PIR schemes with download cost 1/(1 − R) per unit of requested data (R = k/n is the code rate), achieving the information theoretic limit for linear schemes. The proposed schemes are universal since they depend on the code rate, but not on the generator matrix of the code. Also, if b ≤ n − δk nodes collude, with δ = ⌊(n − b)/k⌋, we construct linear PIR schemes with download cost (b + δk)/δ.
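The two download costs quoted in this abstract can be evaluated with a short sketch. It assumes that δ is the floor of (n − b)/k, which is consistent with the stated condition b ≤ n − δk; the function names and the example parameters are hypothetical and serve only to show the arithmetic.

```python
from fractions import Fraction


def pir_cost_no_collusion(n, k):
    """Download cost 1/(1 - R) per unit of requested data, with R = k/n."""
    rate = Fraction(k, n)
    return 1 / (1 - rate)


def pir_cost_collusion(n, k, b):
    """Download cost (b + delta*k)/delta, assuming delta = floor((n - b)/k)."""
    delta = (n - b) // k
    if delta < 1:
        raise ValueError("need n - b >= k so that delta is at least 1")
    return Fraction(b + delta * k, delta)


# Assumed example parameters, not taken from the paper.
cost_single_spy = pir_cost_no_collusion(n=4, k=2)
cost_two_colluding = pir_cost_collusion(n=6, k=2, b=2)
```

With a rate-1/2 code and no collusion, the first expression gives a cost of 2 units downloaded per unit requested. With n = 6, k = 2 and b = 2 colluding nodes, δ = 2 and the second expression gives (2 + 4)/2 = 3.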
