2021
DOI: 10.1007/978-3-030-67661-2_21
FedMAX: Mitigating Activation Divergence for Accurate and Communication-Efficient Federated Learning

Cited by 18 publications (19 citation statements)
References 6 publications
“…Results indicate that the attention aggregation outperforms gradient descent. Chen et al [62] investigate the problem of activation divergence that occurs during the aggregation of models in FL, which can drastically increase training time and reduce accuracy. The authors propose a solution to this problem: a method that maximizes the entropy of activation vectors across all FL participants.…”
Section: Aggregation Methodologies
confidence: 99%
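
The entropy-maximizing regularizer described in the statement above can be sketched as an extra loss term. A minimal PyTorch sketch, assuming a classifier whose last fully-connected activations are available; the function name fedmax_loss and the beta weight are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def fedmax_loss(logits, activations, targets, beta=1.0):
    # Standard classification loss on the model outputs.
    ce = F.cross_entropy(logits, targets)
    # KL term pulling the softmaxed activation vector toward uniform;
    # minimizing it maximizes the entropy of the activations, which is
    # the mechanism the citing paper attributes to FedMAX.
    log_act = F.log_softmax(activations, dim=1)
    uniform = torch.full_like(log_act, 1.0 / activations.size(1))
    kl = F.kl_div(log_act, uniform, reduction="batchmean")
    return ce + beta * kl
```

Because every client regularizes toward the same uniform target, their activation distributions drift apart more slowly, which is what makes the aggregated model converge in fewer communication rounds.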
“…The authors of [45] proposed the FedProx algorithm, which allows partial amounts of local work to be collected from the agents; however, agents are selected randomly for each training round. According to the authors of [46], FedMAX outperforms FedProx in terms of communication rounds by limiting activation divergence across multiple devices.…”
Section: Literature Review
confidence: 99%
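
For context on the FedProx comparison above: the feature that lets partial local work remain useful is a proximal term added to each client's local objective. A minimal sketch under that assumption; the function name fedprox_penalty and the mu default are hypothetical.

```python
import torch

def fedprox_penalty(local_params, global_params, mu=0.01):
    # FedProx-style proximal term, (mu / 2) * ||w - w_global||^2,
    # keeping a partially trained local model close to the current
    # global model so its update can still be aggregated safely.
    penalty = 0.0
    for w, w_g in zip(local_params, global_params):
        penalty = penalty + torch.sum((w - w_g.detach()) ** 2)
    return 0.5 * mu * penalty
```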
“…This approach allows the number of epochs to be tuned based on the non-IIDness of the client data. While it addresses the weight-divergence issue of FedAvg, its convergence is slower at a higher number of epochs than other state-of-the-art algorithms [6,13,14,15]. FedMA offers the best accuracy and convergence speed in comparison, but comes with a significant compute cost on the client devices.…”
Section: FedAvg and Its Challenges
confidence: 99%
“…In this paper, we focus on the optimization-algorithm approach to address the non-IID challenge. While there are numerous state-of-the-art algorithms such as FedProx [12], FedMA [13], and FedMAX [14], these approaches have not been productized at a large scale, to the best of the authors' knowledge. Hence, we focus on the most widely deployed algorithm, FedAvg [1], and investigate improving its ability to handle non-IID data to the level of state-of-the-art algorithms like FedMA, FedProx, and FedMAX.…”
Section: Introduction
confidence: 99%
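
Since the statement above centers on FedAvg, here is a minimal sketch of its server-side aggregation step: a sample-count-weighted average of client model weights. The function name and the state_dict-of-tensors representation are assumptions for illustration.

```python
import torch

def fedavg(client_states, client_sizes):
    # Weighted average of client state_dicts; each client's weight is
    # proportional to its local sample count (assumes float tensors).
    total = float(sum(client_sizes))
    keys = client_states[0].keys()
    avg = {k: torch.zeros_like(client_states[0][k], dtype=torch.float32)
           for k in keys}
    for state, n in zip(client_states, client_sizes):
        for k in keys:
            avg[k] += (n / total) * state[k].float()
    return avg
```

Under non-IID client data this plain averaging is exactly where weight and activation divergence hurt, which is why the cited works layer corrections such as proximal terms (FedProx), matched averaging (FedMA), or activation-entropy regularization (FedMAX) on top of it.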