Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications 2019
DOI: 10.1117/12.2519621

Approaches to address the data skew problem in federated learning

Abstract: A Federated Learning approach consists of creating an AI model from multiple data sources, without moving large amounts of data across to a central environment. Federated learning can be very useful in a tactical coalition environment, where data can be collected individually by each of the coalition partners, but network connectivity is inadequate to move the data to a central environment. However, such data collected is often dirty and imperfect. The data can be imbalanced, and in some cases, some classes ca…

citations
Cited by 22 publications
(10 citation statements)
references
References 8 publications
“…In these scenarios, it would not be beneficial to have a naive implementation of an FL model because the resulting model would be inaccurate [105]. To remedy the data skewing problem, we would need to rely on the server [106]. Furthermore, data skewing can even affect different kinds of data that can be used in FL.…”
Section: Challenges and Limitations Of Federated Learning
Citation type: mentioning
confidence: 99%
“…Since each local device records the activities of its owner, data across devices tend to have different sizes, features, and target classes distribution. This technically means that the local data of one single client cannot be considered to be representative of the overall data distribution [78]. Three scenarios of non-IID data in federated learning can be encountered, i.e., class imbalance, distribution imbalance and size imbalance.…”
Section: Non Independent and Identically Distributed Data
Citation type: mentioning
confidence: 99%
“…For training data that consists of features (categories I and III in Table I), the data collected at different sites may correspond to different regions of the feature space. Extrapolating across different regions of applicability can cause mistakes in the estimation of the AI model [9]. Different sites may also have collected different volumes of training data, which may make some trained models to be better than others.…”
Section: Challenges In Federated Learning
Citation type: mentioning
confidence: 99%