2019
DOI: 10.48550/arxiv.1909.12326
Preprint

Model Pruning Enables Efficient Federated Learning on Edge Devices

Abstract: Federated learning is a recent approach for distributed model training without sharing the raw data of clients. It allows model training using the large amount of user data collected by edge and mobile devices, while preserving data privacy. A challenge in federated learning is that the devices usually have much lower computational power and communication bandwidth than machines in data centers. Training large-sized deep neural networks in such a federated setting can consume a large amount of time and resources…
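As background for the abstract, a minimal sketch of one federated averaging round in which each client trains a pruned (masked) copy of the global model; the function and variable names are illustrative placeholders, not the paper's actual algorithm.

import numpy as np

def federated_round(global_weights, train_locally, clients, mask):
    # One FedAvg-style round: every client receives a masked (pruned) copy of
    # the global model, trains it on its own data, and only the averaged
    # weights travel back to the server; raw client data never leaves the device.
    local_models = []
    for client_data in clients:
        local = {name: w * mask[name] for name, w in global_weights.items()}
        local_models.append(train_locally(local, client_data))
    return {
        name: np.mean([m[name] for m in local_models], axis=0) * mask[name]
        for name in global_weights
    }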

Cited by 24 publications (29 citation statements)
References 17 publications (21 reference statements)
“…As introduced in Section II, the communication overhead between each client and the central server is proportional to the desired rank r for compression. However, since the energy-based criterion in (5) does not adjust the rank directly, it is unclear how the communication overhead changes during the training process, which is studied in the following theorem. First, following [16], we extend Assumption 2 to the following assumption.…”
Section: B Communication Overhead
confidence: 99%
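The quoted statement ties per-round communication to the rank r used for compression. A minimal sketch of that proportionality, assuming an m × n weight matrix is exchanged as two rank-r factors U (m × r) and V (r × n); the 32-bit parameter size and layer dimensions are illustrative assumptions, not values from the cited work.

def full_matrix_bytes(m, n, bytes_per_param=4):
    # Cost of uploading the dense m x n weight matrix.
    return m * n * bytes_per_param

def low_rank_bytes(m, n, r, bytes_per_param=4):
    # Cost of uploading rank-r factors U (m x r) and V (r x n) instead;
    # the cost grows linearly in r.
    return (m * r + r * n) * bytes_per_param

# Example: a 1024 x 1024 layer at rank 32.
print(full_matrix_bytes(1024, 1024))   # 4194304 bytes
print(low_rank_bytes(1024, 1024, 32))  # 262144 bytes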
“…For local training, we use cross entropy as the loss function and perform 25 local training iterations. To show the effectiveness of the proposed FedDLR, we choose FedAvg [2] and PruneFL [5] as the baselines to compare.…”
Section: A Experimental Setting
confidence: 99%
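A minimal sketch of the local training step described in the quote, assuming a PyTorch-style client that runs 25 cross-entropy iterations before returning its weights; the model, data loader, and learning rate are placeholders, not the cited experimental configuration.

import torch
import torch.nn as nn

def local_train(model, data_loader, lr=0.01, local_iters=25):
    # Run a fixed number of local cross-entropy iterations on one client,
    # then return the weights for server-side aggregation.
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    batches = iter(data_loader)
    for _ in range(local_iters):
        try:
            x, y = next(batches)
        except StopIteration:
            batches = iter(data_loader)
            x, y = next(batches)
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
    return model.state_dict()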
“…Secondly, communication can be reduced by reducing the model size because the model size is proportional to the amount of training communication. PruneFL (Jiang et al., 2019) progressively prunes the model over the course of training, while AFD (Bouacida et al., 2021) only trains submodels on clients.…”
Section: Related Work
confidence: 99%
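A rough illustration of the progressive-pruning idea the quote attributes to PruneFL, assuming simple global magnitude pruning whose sparsity target grows linearly over the training rounds; the schedule and threshold logic are a generic sketch, not the adaptive procedure of the cited paper.

import numpy as np

def magnitude_mask(weights, sparsity):
    # Zero out the given fraction of smallest-magnitude weights.
    k = int(weights.size * sparsity)
    if k == 0:
        return np.ones_like(weights, dtype=bool)
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return np.abs(weights) > threshold

def prune_progressively(weights, round_idx, total_rounds, final_sparsity=0.8):
    # Ramp the pruned fraction up as federated training proceeds.
    sparsity = final_sparsity * min(1.0, round_idx / total_rounds)
    mask = magnitude_mask(weights, sparsity)
    return weights * mask, mask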
“…Thus, within the data preprocessor, an optional component heterogeneous data handler is adopted to deal with the non-IID and skewed data distribution issue through data augmentation techniques. The known uses of the component include Astraea [16], FAug scheme [14] and Federated Distillation (FD) method [2].…”
Section: Data Collection and Preprocessing
confidence: 99%
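A small sketch of what such a heterogeneous data handler could do, assuming it merely oversamples under-represented classes on a client to soften a skewed label distribution; this is a generic illustration of the component, not the Astraea, FAug, or FD mechanism.

import random
from collections import defaultdict

def rebalance_by_oversampling(samples):
    # samples: list of (features, label) pairs held by one client.
    # Oversample minority classes until every label matches the largest class.
    by_label = defaultdict(list)
    for x, y in samples:
        by_label[y].append((x, y))
    target = max(len(items) for items in by_label.values())
    balanced = []
    for items in by_label.values():
        balanced.extend(items)
        if len(items) < target:
            balanced.extend(random.choices(items, k=target - len(items)))
    random.shuffle(balanced)
    return balanced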
“…A message compressor component can be added to improve communication efficiency. The embedded patterns are extracted from Google Sketched Update [20] and IBM PruneFL [16].…”
Section: Model Training
confidence: 99%
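A compact sketch of one common message-compressor pattern, assuming top-k sparsification of the model update before upload; it illustrates the component only in generic form, not the sketched-update scheme of [20] or the pruning scheme of [16].

import numpy as np

def compress_topk(update, k):
    # Keep only the k largest-magnitude entries of the flattened update.
    flat = update.ravel()
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    return idx, flat[idx], update.shape

def decompress_topk(idx, values, shape):
    # Server side: rebuild a dense update with zeros elsewhere.
    flat = np.zeros(int(np.prod(shape)), dtype=values.dtype)
    flat[idx] = values
    return flat.reshape(shape)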