2021
DOI: 10.48550/arxiv.2104.03490
Preprint
Joint Optimization of Communications and Federated Learning Over the Air

Xin Fan,
Yue Wang,
Yan Huo
et al.

Abstract: Federated learning (FL) is an attractive paradigm for making use of rich distributed data while protecting data privacy. Nonetheless, nonideal communication links and limited transmission resources have become the bottleneck of the implementation of fast and accurate FL. In this paper, we study joint optimization of communications and FL based on analog aggregation transmission in realistic wireless networks. We first derive a closed-form expression for the expected convergence rate of FL over the air, which t…

Cited by 5 publications (9 citation statements)
References 23 publications (53 reference statements)
“…In [13], the authors employ random projection of the sparsified model updates at the devices, which allows the devices to significantly reduce the bandwidth requirement without sacrificing performance. The authors in [109] first analyzed how user selection and transmit power affect the convergence of AirComp-based FL and then optimized these wireless factors to improve its performance. The work in [110] studied the use of 1-bit compressive sensing (CS) for analog ML model aggregation, thereby reducing the size of the FL parameters transmitted over wireless links.…”
Section: State-of-the-art and Research Opportunities
confidence: 99%
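The analog (AirComp) aggregation these statements refer to can be sketched as follows. This is a minimal illustrative simulation, not the scheme of [13], [109], or [110]: the Rayleigh channel model, the channel-inversion power scaling, and all variable names are assumptions introduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

num_devices, dim = 10, 5
# Hypothetical local model updates, one row per device.
updates = rng.normal(size=(num_devices, dim))

# Rayleigh-fading channel magnitudes for each device.
h = np.abs(rng.normal(size=num_devices) + 1j * rng.normal(size=num_devices)) / np.sqrt(2)

# Channel-inversion precoding: each device pre-scales its update by 1/h
# so that, after the channel applies h, the signals add up coherently.
precoded = updates / h[:, None]

# Superposition over the air at the receiver, plus additive Gaussian noise.
noise = 0.01 * rng.normal(size=dim)
received = (h[:, None] * precoded).sum(axis=0) + noise

# The server recovers (a noisy version of) the average model update.
global_update = received / num_devices
```

In practice channel inversion is truncated or power-limited for deeply faded devices, which is exactly where the user-selection and power-control questions analyzed in [109] arise.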
“…The convergence behavior demonstrates that the noisy iterates typically introduce a non-negligible optimality gap in various FL algorithms, e.g., the vanilla gradient method [174], quantized gradient method [175], sparsified gradient method [173], and operator splitting method [87]. The optimality gap can be further controlled by transmit power allocation [176], [41], [173], receiver beamforming design for model aggregation [177], [31], [178], and device scheduling [178], [31], [179]. Besides, channel perturbation in the algorithm iterates can also serve as a mechanism for designing saddle-point-escaping algorithms [94], thereby establishing global optimality for training non-convex over-parameterized neural networks in high-dimensional statistical settings [180].…”
Section: B Wireless Techniques For Edge Training
confidence: 99%
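The optimality gap induced by noisy iterates can be illustrated on a toy quadratic: gradient descent with clean gradients converges to the optimum, while additive channel-like noise on each aggregated gradient leaves the iterate hovering at a nonzero distance from it. The objective, step size, and noise level below are illustrative assumptions, not taken from the cited works.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simple quadratic objective f(w) = 0.5 * ||w - w_star||^2,
# so grad f(w) = w - w_star and the minimizer is w_star.
w_star = np.array([1.0, -2.0, 0.5])

def run_gd(noise_std, steps=500, lr=0.1):
    """Gradient descent whose gradients are corrupted by additive noise,
    mimicking channel noise in over-the-air aggregation."""
    w = np.zeros_like(w_star)
    for _ in range(steps):
        grad = w - w_star
        w -= lr * (grad + noise_std * rng.normal(size=w.shape))
    return np.linalg.norm(w - w_star)  # distance to the optimum

gap_clean = run_gd(noise_std=0.0)
gap_noisy = run_gd(noise_std=0.5)
```

Shrinking the noise via power allocation or beamforming, or averaging over more scheduled devices, shrinks this residual gap, which is the mechanism behind the control knobs listed in the statement above.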
“…Specifically, for edge training systems via AirComp, global model aggregation errors due to wireless channel fading and noise cause learning performance degradation [45], [87]. The optimality gap (i.e., the distance between the current iterate and the desired solution), characterized by the convergence behavior of the global iterate, can be further controlled by various resource allocation schemes, including transmit power control at the edge devices [41], [237], receive beamforming at the edge server [31], [179], passive beamforming at the RIS [48], [178], and the device scheduling policy [31], [48]. For digital designs of the edge training system, optimality basically depends on edge device selection, packet errors in the uplink transmission, and model parameter partition, for which user scheduling [238], power control [88], batch-size selection [239], aggregation frequency control [240], and bandwidth allocation [241] have been proposed to improve accuracy in the edge training process.…”
Section: ) Accuracy
confidence: 99%
“…Benefiting from communication-efficient gradient aggregation, FLOA as a cross-disciplinary topic has attracted growing research interest in the fields of communications, optimization, and machine learning, such as power control [16], [19], [29] and device scheduling [16], [18], [25],…”
Section: Introduction
confidence: 99%
“…For instance, a broadband analog aggregation scheme for power control and device scheduling in FLOA is proposed in [18], where a set of tradeoffs between communications and learning is discussed. In [16], a convergence analysis quantifies the impact of AirComp on FL, and a joint optimization of communication and learning is then proposed for optimal power scaling and device scheduling. Considering energy-constrained local devices, an energy-aware device scheduling strategy is proposed in [25] to maximize the average number of workers scheduled for gradient updates.…”
Section: Introduction
confidence: 99%
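Energy-aware device scheduling of the kind mentioned for [25] can be sketched with a simple greedy heuristic: rank devices by channel gain per unit of energy and schedule them until the round's energy budget is exhausted. The budget, gains, costs, and the greedy rule itself are illustrative assumptions, not the actual strategy of [25].

```python
import numpy as np

rng = np.random.default_rng(2)

num_devices = 8
# Hypothetical per-round channel gains and per-device energy costs.
gains = rng.uniform(0.1, 1.0, size=num_devices)
energy_cost = rng.uniform(1.0, 3.0, size=num_devices)
budget = 6.0  # total energy available for this round

# Greedy energy-aware scheduling: prefer devices with high channel
# gain per unit energy, subject to the round's total energy budget.
order = np.argsort(-gains / energy_cost)
scheduled, spent = [], 0.0
for d in order:
    if spent + energy_cost[d] <= budget:
        scheduled.append(int(d))
        spent += energy_cost[d]
```

Maximizing the expected number of scheduled workers under such budgets is a knapsack-like problem; the greedy ratio rule here is only one cheap approximation to it.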