2021
DOI: 10.48550/arxiv.2106.06843
Preprint

Federated Learning on Non-IID Data: A Survey

Abstract: Federated learning is an emerging distributed machine learning framework for privacy preservation. However, models trained in federated learning usually have worse performance than those trained in the standard centralized learning mode, especially when the training data are not independent and identically distributed (Non-IID) on the local devices. In this survey, we provide a detailed analysis of the influence of Non-IID data on both parametric and non-parametric machine learning models in both horizontal an…

Citations: cited by 15 publications (20 citation statements)
References: 139 publications (154 reference statements)
“…6, it is clear to see that the test performances on these three datasets are relatively insensitive to different numbers of clients. These results make sense since non-parametric models trained in the VFL setting have nearly the same performance as those trained in the standard centralized learning [44]. By contrast, however, the average test accuracy on the Credit dataset drops from 0.82 to 0.8 when the number of clients is reduced to 2.…”
Section: B Sensitivity Analysis (mentioning)
confidence: 93%
“…• IID-ness of client datasets. The test accuracy of a model learned using FL is adversely impacted if the datasets on the remote clients are not independent and identically distributed (IID) [80,81]. The fundamental reason for this performance degradation is that when the client datasets are non-IID or heterogeneous, the local models trained on the clients may diverge, despite having the same initial parameters.…”
Section: Key Elements Of a Federated Learning System (mentioning)
confidence: 99%
“…Periodically, the updated global model is pushed to each device, allowing individual users to benefit from the experience of the model on other devices. Federated Learning is not a trivial problem as the heterogeneous (non-iid) data generated by different clients causes the local models to diverge, making it difficult to aggregate the local updates into a global update [33]. Despite being a relatively new research field, Federated Learning is already deployed in practice [34].…”
Section: Retraining and Personalizing Models (mentioning)
confidence: 99%
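
Two of the statements above say that local models diverge under non-IID data even when they start from the same initial parameters. The following is a minimal sketch of that effect, not code from the survey or the citing papers. It assumes a hypothetical one-parameter model fitted by gradient descent on a squared-error loss, two clients, and a sample-size-weighted average as the aggregation step; all function names and constants are illustrative.

```python
import random


def local_sgd(w, data, lr=0.1, epochs=5):
    """Run a few full-batch gradient steps on the loss mean((w - x)^2)."""
    for _ in range(epochs):
        grad = sum(2 * (w - x) for x in data) / len(data)
        w -= lr * grad
    return w


def fedavg(local_weights, sizes):
    """Average local parameters, weighted by each client's sample count."""
    total = sum(sizes)
    return sum(w * n for w, n in zip(local_weights, sizes)) / total


def client_divergence(partitions, w0=0.0):
    """Train every client from the same w0; return (spread, averaged model)."""
    local = [local_sgd(w0, part) for part in partitions]
    spread = max(local) - min(local)
    return spread, fedavg(local, [len(p) for p in partitions])


rng = random.Random(0)  # fixed seed so the sketch is reproducible
low = [rng.gauss(-1.0, 0.1) for _ in range(100)]
high = [rng.gauss(1.0, 0.1) for _ in range(100)]

# Non-IID split: each client only ever sees one of the two clusters.
non_iid = [low, high]

# IID split: shuffle the pooled data, then cut it in half.
mixed = low + high
rng.shuffle(mixed)
iid = [mixed[:100], mixed[100:]]

spread_non_iid, global_non_iid = client_divergence(non_iid)
spread_iid, global_iid = client_divergence(iid)

print(f"non-IID: local spread={spread_non_iid:.3f}, averaged w={global_non_iid:.3f}")
print(f"IID:     local spread={spread_iid:.3f}, averaged w={global_iid:.3f}")
```

In this deliberately simple quadratic problem with equal client sizes, the averaged parameter is the same for both splits, because the update is linear in the parameter. The sketch therefore shows only the divergence of the local models, not the accuracy loss that the quoted statements report for harder, non-convex settings.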