Abstract: Federated learning (FL) is increasingly becoming the norm for training models over distributed and private datasets. Major service providers rely on FL to improve services such as text auto-completion, virtual keyboards, and item recommendations. Nonetheless, training models with FL in practice requires a significant amount of time (days or even weeks) because FL tasks execute in highly heterogeneous environments where devices only have widespread yet limited computing capabilities and network connectivity conditions.
“…We expect that the heterogeneity impact might be reduced by means of proportional load balancing during the selection phase [37], computation offloading techniques at the edge [50], adaptive compression of model for computation [3] and communication [2,4] or asynchronous mode of model updates [30].…”
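One mitigation the quote mentions is proportional load balancing during the selection phase. A minimal sketch of that idea, assuming a hypothetical per-client capability score (the function name, client IDs, and scores below are illustrative, not from the cited works): clients are sampled without replacement with probability proportional to their capability, so slow devices are picked less often.

```python
import random

def select_clients(capabilities, k, seed=0):
    """Sample k distinct clients with probability proportional to a
    (hypothetical) capability score, so faster devices participate
    more often and stragglers gate fewer rounds."""
    rng = random.Random(seed)
    pool = dict(capabilities)  # client_id -> capability score
    chosen = []
    for _ in range(min(k, len(pool))):
        ids, weights = zip(*pool.items())
        pick = rng.choices(ids, weights=weights, k=1)[0]
        chosen.append(pick)
        del pool[pick]  # sample without replacement
    return chosen

caps = {"phone_a": 1.0, "phone_b": 4.0, "tablet": 2.0, "laptop": 8.0}
print(select_clients(caps, k=2))
```

Fixing the seed makes the selection reproducible across rounds; a production selector would also need to account for data representativeness, not just speed.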
Federated learning (FL) is becoming a popular paradigm for collaborative learning over distributed, private datasets owned by non-trusting entities. FL has seen successful deployment in production environments, and it has been adopted in services such as virtual keyboards, auto-completion, item recommendation, and several IoT applications. However, FL comes with the challenge of performing training over largely heterogeneous datasets, devices, and networks that are out of the control of the centralized FL server. Motivated by this inherent setting, we make a first step towards characterizing the impact of device and behavioral heterogeneity on the trained model. We conduct an extensive empirical study spanning close to 1.5K unique configurations on five popular FL benchmarks. Our analysis shows that these sources of heterogeneity have a major impact on both model performance and fairness, thus shedding light on the importance of considering heterogeneity in FL system design.
“…In this context, device heterogeneity results in performance degradation due to stragglers (i.e., slow workers) who slow down the training process [48], [49]. Several works have tried to address this problem via system-level and algorithmic solutions [20], [39], [46], [48]-[50]. In FL settings, heterogeneity stems from additional system artifacts and is not limited to heterogeneity in device capabilities.…”
Federated learning (FL) is becoming a popular paradigm for collaborative learning over distributed, private datasets owned by non-trusting entities. FL has seen successful deployment in production environments, and it has been adopted in services such as virtual keyboards, auto-completion, item recommendation, and several IoT applications. However, FL comes with the challenge of performing training over largely heterogeneous datasets, devices, and networks that are out of the control of the centralized FL server. Motivated by this inherent challenge, we aim to empirically characterize the impact of device and behavioral heterogeneity on the trained model. We conduct an extensive empirical study spanning nearly 1.5K unique configurations on five popular FL benchmarks. Our analysis shows that these sources of heterogeneity have a major impact on both model quality and fairness, causing up to 4.6× and 2.2× degradation in the quality and fairness, respectively, thus shedding light on the importance of considering heterogeneity in FL system design.
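The straggler effect behind these degradations can be illustrated with a toy round-time model (the timings below are illustrative only, not measurements from the paper): in synchronous FL such as FedAvg, the server waits for every selected client, so a single slow device gates the whole round, whereas an asynchronous server ingests updates as they arrive.

```python
def round_time_sync(client_times):
    """Synchronous aggregation: the round is gated by the slowest
    selected client, so one straggler dominates."""
    return max(client_times)

def round_time_async(client_times):
    """Rough proxy for asynchronous updates: the server ingests each
    update on arrival, so mean latency is a better indicator."""
    return sum(client_times) / len(client_times)

homogeneous = [10, 10, 10, 10]    # seconds per local update
heterogeneous = [10, 12, 11, 40]  # one straggler

print(round_time_sync(homogeneous))    # 10
print(round_time_sync(heterogeneous))  # 40
```

With identical mean work, the heterogeneous cohort's round is 4x longer; this is the mechanism that makes mitigations such as over-selection, deadlines, or asynchrony attractive.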
“…Additionally, the effectiveness of SFL with privacy and resilience safeguards is assessed in more extensive experimental situations. • Heterogeneity: FL faces a considerable challenge when operating over the heterogeneous devices and data of the whole system [71][72][73]. Indeed, an increasing number of intelligent devices can connect to train the FL system.…”
Section: Challenges Of Federated Learning
“…Extensive experiments on MNIST, FashionMNIST, MedMNIST, and CIFAR-10 demonstrate that their suggested approaches can achieve satisfactory performance with guaranteed convergence and efficiently use all the resources available for training across different devices with lower communication cost than its homogeneous counterpart. Abdelmoniem, A.M., and Canini, M. [73] also concentrate on reducing the degree of device heterogeneity by suggesting AQFL, a straightforward and useful method that uses adaptive model quantization to homogenize the clients' computational resources. They assess AQFL using five standard FL metrics.…”
Section: Challenges Of Federated Learning
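The quantization idea behind AQFL can be sketched with plain uniform quantization of a model update (a simplification under assumed parameters — AQFL's actual scheme and bit-assignment policy may differ): weaker devices would be handed fewer bits, trading accuracy of the update for lower compute and communication cost.

```python
import numpy as np

def quantize(update, num_bits):
    """Uniform symmetric quantization of a model update to num_bits.
    Illustrative sketch only, not the AQFL algorithm itself."""
    levels = 2 ** (num_bits - 1) - 1          # e.g. 127 for 8 bits
    scale = np.max(np.abs(update)) / levels   # step size
    q = np.round(update / scale).astype(np.int32)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
update = rng.standard_normal(1000).astype(np.float32)
for bits in (8, 4, 2):
    q, s = quantize(update, bits)
    err = np.abs(dequantize(q, s) - update).mean()
    print(f"{bits}-bit: mean abs error {err:.4f}")
```

Fewer bits means coarser steps and larger reconstruction error; an adaptive policy picks the bit-width per client so that slow clients still finish a round on time.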
The air quality index (AQI) forecast in big cities is an exciting study area in smart cities and healthcare on the Internet of Things. In recent years, a large number of empirical, academic, and review papers using machine learning (ML) for air quality analysis have been published. However, most of those studies focused on traditional centralized processing on a single machine, and there have been few surveys of federated learning (FL) in this field. This overview aims to fill this gap and provide newcomers with a broader perspective to inform future research on this topic, especially for the multi-model approach. In this survey, we review the work that previous scholars have conducted on AQI forecasting, in both traditional ML approaches and FL mechanisms. Our objective is to comprehend previous research on AQI prediction, including methods, models, data sources, achievements, challenges, and solutions applied in the past. We also outline a new direction using multi-model FL, which has recently piqued the computer science community's interest.