2019
DOI: 10.48550/arxiv.1909.12326
Preprint

Model Pruning Enables Efficient Federated Learning on Edge Devices

Abstract: Federated learning is a recent approach for distributed model training without sharing the raw data of clients. It allows model training using the large amount of user data collected by edge and mobile devices, while preserving data privacy. A challenge in federated learning is that the devices usually have much lower computational power and communication bandwidth than machines in data centers. Training large-sized deep neural networks in such a federated setting can consume a large amount of time and resources…
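As background for the abstract, a minimal sketch of one federated averaging round in which each client trains a pruned (masked) copy of the global model; the function and variable names are illustrative placeholders, not the paper's actual algorithm.

import numpy as np

def federated_round(global_weights, train_locally, clients, mask):
    # One FedAvg-style round: every client receives a masked (pruned) copy of
    # the global model, trains it on its own data, and only the averaged
    # weights travel back to the server; raw client data never leaves the device.
    local_models = []
    for client_data in clients:
        local = {name: w * mask[name] for name, w in global_weights.items()}
        local_models.append(train_locally(local, client_data))
    return {
        name: np.mean([m[name] for m in local_models], axis=0) * mask[name]
        for name in global_weights
    }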

Cited by 24 publications (29 citation statements)
References 17 publications (21 reference statements)
“…As introduced in Section II, the communication overhead between each client and the central server is proportional to the desired rank r for compression. However, since the energy-based criterion in (5) does not adjust the rank directly, it is unclear how the communication overhead changes during the training process, which is studied in the following theorem. First, following [16], we extend Assumption 2 to the following assumption.…”
Section: B Communication Overhead
confidence: 99%
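The quoted statement ties per-round communication to the rank r used for compression. A minimal sketch of that proportionality, assuming an m × n weight matrix is exchanged as two rank-r factors U (m × r) and V (r × n); the 32-bit parameter size and layer dimensions are illustrative assumptions, not values from the cited work.

def full_matrix_bytes(m, n, bytes_per_param=4):
    # Cost of uploading the dense m x n weight matrix.
    return m * n * bytes_per_param

def low_rank_bytes(m, n, r, bytes_per_param=4):
    # Cost of uploading rank-r factors U (m x r) and V (r x n) instead;
    # the cost grows linearly in r.
    return (m * r + r * n) * bytes_per_param

# Example: a 1024 x 1024 layer at rank 32.
print(full_matrix_bytes(1024, 1024))   # 4194304 bytes
print(low_rank_bytes(1024, 1024, 32))  # 262144 bytes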
“…For local training, we use cross entropy as the loss function and perform 25 local training iterations. To show the effectiveness of the proposed FedDLR, we choose FedAvg [2] and PruneFL [5] as the baselines to compare.…”
Section: A Experimental Setting
confidence: 99%
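A minimal sketch of the local training step described in the quote, assuming a PyTorch-style client that runs 25 cross-entropy iterations before returning its weights; the model, data loader, and learning rate are placeholders, not the cited experimental configuration.

import torch
import torch.nn as nn

def local_train(model, data_loader, lr=0.01, local_iters=25):
    # Run a fixed number of local cross-entropy iterations on one client,
    # then return the weights for server-side aggregation.
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    batches = iter(data_loader)
    for _ in range(local_iters):
        try:
            x, y = next(batches)
        except StopIteration:
            batches = iter(data_loader)
            x, y = next(batches)
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
    return model.state_dict()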
“…Secondly, communication can be reduced by reducing the model size because the model size is proportional to the amount of training communication. PruneFL (Jiang et al., 2019) progressively prunes the model over the course of training, while AFD (Bouacida et al., 2021) only trains submodels on clients.…”
Section: Related Work
confidence: 99%
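A rough illustration of the progressive-pruning idea the quote attributes to PruneFL, assuming simple global magnitude pruning whose sparsity target grows linearly over the training rounds; the schedule and threshold logic are a generic sketch, not the adaptive procedure of the cited paper.

import numpy as np

def magnitude_mask(weights, sparsity):
    # Zero out the given fraction of smallest-magnitude weights.
    k = int(weights.size * sparsity)
    if k == 0:
        return np.ones_like(weights, dtype=bool)
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return np.abs(weights) > threshold

def prune_progressively(weights, round_idx, total_rounds, final_sparsity=0.8):
    # Ramp the pruned fraction up as federated training proceeds.
    sparsity = final_sparsity * min(1.0, round_idx / total_rounds)
    mask = magnitude_mask(weights, sparsity)
    return weights * mask, mask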
“…Thus, within the data preprocessor, an optional component heterogeneous data handler is adopted to deal with the non-IID and skewed data distribution issue through data augmentation techniques. The known uses of the component include Astraea [16], FAug scheme [14] and Federated Distillation (FD) method [2].…”
Section: Data Collection and Preprocessing
confidence: 99%
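A small sketch of what such a heterogeneous data handler could do, assuming it merely oversamples under-represented classes on a client to soften a skewed label distribution; this is a generic illustration of the component, not the Astraea, FAug, or FD mechanism.

import random
from collections import defaultdict

def rebalance_by_oversampling(samples):
    # samples: list of (features, label) pairs held by one client.
    # Oversample minority classes until every label matches the largest class.
    by_label = defaultdict(list)
    for x, y in samples:
        by_label[y].append((x, y))
    target = max(len(items) for items in by_label.values())
    balanced = []
    for items in by_label.values():
        balanced.extend(items)
        if len(items) < target:
            balanced.extend(random.choices(items, k=target - len(items)))
    random.shuffle(balanced)
    return balanced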
“…A message compressor component can be added to improve communication efficiency. The embedded patterns are extracted from Google Sketched Update [20] and IBM PruneFL [16].…”
Section: Model Training
confidence: 99%
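A compact sketch of one common message-compressor pattern, assuming top-k sparsification of the model update before upload; it illustrates the component only in generic form, not the sketched-update scheme of [20] or the pruning scheme of [16].

import numpy as np

def compress_topk(update, k):
    # Keep only the k largest-magnitude entries of the flattened update.
    flat = update.ravel()
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    return idx, flat[idx], update.shape

def decompress_topk(idx, values, shape):
    # Server side: rebuild a dense update with zeros elsewhere.
    flat = np.zeros(int(np.prod(shape)), dtype=values.dtype)
    flat[idx] = values
    return flat.reshape(shape)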