2018
DOI: 10.48550/arxiv.1806.00582
Preprint

Federated Learning with Non-IID Data

Abstract: Federated learning enables resource-constrained edge compute devices, such as mobile phones and IoT devices, to learn a shared model for prediction while keeping the training data local. This decentralized approach to training models provides privacy, security, regulatory, and economic benefits. In this work, we focus on the statistical challenge of federated learning when local data is non-IID. We first show that the accuracy of federated learning reduces significantly, by up to ~55% for neural networks trained …

Cited by 642 publications (971 citation statements) · References 18 publications
“…The first FL framework, Federated Averaging (FedAvg) (McMahan et al. 2017), aggregates locally trained client models into a global model w_glb for the next training round. Although FedAvg achieves superior performance on homogeneous clients with IID data, its performance degrades when the data distribution among client devices is non-IID (Zhao et al. 2018; Wang et al. 2020a). While multiple FL frameworks were proposed to mitigate the impact of statistical diversity, either by controlling the divergence between the local model and the global model (Li et al. 2020; Karimireddy et al. 2019) or by training a personalized model for each client (Smith et al. 2017; Deng, Kamani, and Mahdavi 2020; Zhang et al. 2020), none of these studies has considered system performance objectives in their design.…”
Section: Federated Learning (mentioning)
confidence: 99%
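For readers unfamiliar with the aggregation step these citing papers refer to, here is a minimal sketch of FedAvg's server-side weighted averaging (McMahan et al. 2017). Function and variable names are illustrative, not taken from any cited paper's code.

```python
import numpy as np

def fedavg_aggregate(client_weights, client_sizes):
    """Weighted-average per-client parameter lists into the next global model.

    client_weights: one list of parameter tensors per client, same shapes across clients.
    client_sizes:   number of local training examples per client.
    """
    total = float(sum(client_sizes))
    coeffs = [n / total for n in client_sizes]
    # For each layer, sum the clients' tensors weighted by their share of the data.
    return [sum(c * w for c, w in zip(coeffs, layer))
            for layer in zip(*client_weights)]

# Toy usage: three clients, each holding one parameter tensor.
clients = [[np.array([1.0, 2.0])],
           [np.array([3.0, 4.0])],
           [np.array([5.0, 6.0])]]
print(fedavg_aggregate(clients, client_sizes=[10, 20, 70]))  # -> [array([4.2, 5.2])]
```

With IID clients this weighted mean tracks the centralized gradient well; the citing papers' point is that it degrades once the per-client distributions diverge.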
“…However, FedAvg may not converge if data from different clients is non-i.i.d. (Zhao et al., 2018; Li et al., 2019) and some clients do not regularly participate in the training (Yang et al., 2021), as is often the case in federated learning scenarios. We show similar results for federated graph training.…”
Section: Related Work (mentioning)
confidence: 99%
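The non-i.i.d. client data described above is commonly simulated by partitioning a labeled dataset with a Dirichlet prior over per-class proportions: a small alpha yields the strongly skewed per-client distributions under which FedAvg struggles, while a large alpha approaches IID. A minimal sketch of this standard setup, with illustrative names:

```python
import numpy as np

def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split example indices across clients with per-class Dirichlet proportions."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        idx = rng.permutation(np.where(labels == cls)[0])
        # Draw this class's split across clients; small alpha -> strong skew.
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, shard in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(shard.tolist())
    return client_indices

# Toy usage: 1000 examples over 10 classes, split across 5 clients, strongly skewed.
labels = np.random.default_rng(1).integers(0, 10, size=1000)
parts = dirichlet_partition(labels, num_clients=5, alpha=0.1)
print([len(p) for p in parts])
```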
“…This data, however, is often privacy-sensitive: for example, users in a social network may not want to reveal the websites they have visited. In non-graphical settings, distributed learning has recently shown promise for preserving user privacy while training accurate models, e.g., federated learning algorithms have become increasingly popular (Yang et al., 2021; Zhao et al., 2018). Some papers have begun to apply federated algorithms to training GCNs (He et al., 2021; Wang et al., 2020). [Footnote: code available at https://github.com/yh-yao/Federated-GCN]…”
Section: Introduction (mentioning)
confidence: 99%
“…In ideal cases, Independent and Identically Distributed (IID) data are desired, in terms of both features and classes. In real scenarios, however, non-IID data with distributions ranging from moderately to strongly skewed is inevitable [5, 6]. The global class imbalance problem may prevail and adversely affect the performance of the global model [7].…”
Section: Introduction (mentioning)
confidence: 99%
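One simple way to quantify the moderate-to-strong label skew this passage describes is to compare each client's empirical label distribution against the global one, e.g., with total variation distance (0 means the client matches the global mix; 1 means fully disjoint labels). A minimal sketch, with illustrative names:

```python
import numpy as np

def label_skew(client_labels, num_classes):
    """Total variation distance of each client's label distribution from the global one."""
    all_labels = np.concatenate(client_labels)
    global_dist = np.bincount(all_labels, minlength=num_classes) / len(all_labels)
    skews = []
    for labels in client_labels:
        local_dist = np.bincount(labels, minlength=num_classes) / len(labels)
        skews.append(0.5 * np.abs(local_dist - global_dist).sum())  # 0 = matches global
    return skews

# Toy usage: client 0 holds mostly class 0, client 1 holds classes 1 and 2.
clients = [np.array([0, 0, 0, 1]), np.array([1, 1, 2, 2])]
print(label_skew(clients, num_classes=3))  # -> [0.375, 0.375]
```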