2022
DOI: 10.1145/3517821
Auto-weighted Robust Federated Learning with Corrupted Data Sources

Abstract: Federated learning provides a communication-efficient and privacy-preserving training process by enabling learning statistical models with massive participants without accessing their local data. Standard federated learning techniques that naively minimize an average loss function are vulnerable to data corruptions from outliers, systematic mislabeling, or even adversaries. In this paper, we address this challenge by proposing Auto-weighted Robust Federated Learning (ARFL), a novel ap…

Cited by 16 publications
(6 citation statements)
References 21 publications
“…For the IID partitioning, we randomly split the training set into 20 subsets and allocate them to the 20 clients, thus each client has 2500 training samples. For Non-IID partition, we follow prior work [34], [35] and model the Non-IID data distributions with a Dirichlet distribution p l ∼ Dir K (α), then we allocate a p l,k proportion of the training sample of class l to client k, in which a smaller α indicates stronger Non-IID data partition. We let α = 0.1 and visualize the resulting statistical heterogeneity of labels in Fig.…”
Section: Methods
mentioning confidence: 99%
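The per-class Dirichlet partitioning that the citing papers describe (draw p_l ~ Dir_K(α) for each class l, then give client k a p_{l,k} fraction of that class's samples) can be sketched as follows. This is a minimal illustration using NumPy; the function name and defaults are hypothetical, not taken from the cited works:

```python
import numpy as np

def dirichlet_partition(labels, num_clients=20, alpha=0.1, seed=0):
    """Split sample indices across clients with a per-class Dirichlet prior.

    For each class l, draw p_l ~ Dir_K(alpha) and assign client k roughly a
    p_{l,k} fraction of that class's samples; a smaller alpha yields a
    stronger Non-IID skew (alpha = 0.1 matches the quoted setup).
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for l in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == l))
        p = rng.dirichlet(alpha * np.ones(num_clients))
        # Cut points proportional to p_{l,k}; each client gets one slice
        # of this class's (shuffled) sample indices.
        cuts = (np.cumsum(p)[:-1] * len(idx)).astype(int)
        for k, part in enumerate(np.split(idx, cuts)):
            client_indices[k].extend(part.tolist())
    return client_indices

# Example: 10 classes x 100 samples split across 20 clients.
parts = dirichlet_partition(np.repeat(np.arange(10), 100), num_clients=20, alpha=0.1)
```

With α = 0.1 most clients end up holding samples from only a few classes, which is what produces the label heterogeneity the quoted experiments visualize.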
“…For the IID partitioning, we randomly split the training set into 20 subsets and allocate them to the 20 clients. For the Non-IID partition, we follow prior work [38], [39] and model the Non-IID data distributions with a Dirichlet distribution p l ∼ Dir K (α). Then we allocate a p l,k proportion of the training sample of class l to client k, in which a smaller α indicates a stronger Non-IID data partition.…”
Section: Methods
mentioning confidence: 99%
“…Similar to other schemes, ClippedClustering is also less robust in Non-IID data scenarios. We note that Non-IIDness is a well-known challenge to FL especially when it comes to robustness [39], as it becomes hard to induce a consensus model for the benign clients if their data distributions are significantly different. Thus the malicious clients can damage the global model more easily.…”
Section: Impact On Robust Aggregation Schemes
mentioning confidence: 99%
“…To achieve this, a GAN is trained over leaf blight images only, with a subset of nodes that correspond only to the malicious clients, namely 30% of the whole FL system. We selected this number because a significant number of comparable studies that assess defense mechanisms or suggest attacks employ a percentage similar to this one for the malevolent clients [58][59][60]. The GAN has been trained for 80 rounds using the joint malicious dataset.…”
Section: GAN-generated Images
mentioning confidence: 99%