2021
DOI: 10.48550/arxiv.2112.01637
Preprint

AdaSplit: Adaptive Trade-offs for Resource-constrained Distributed Deep Learning

Abstract: Distributed deep learning frameworks like federated learning (FL) and its variants are enabling personalized experiences across a wide range of web clients and mobile/IoT devices. However, these FL-based frameworks are constrained by computational resources at clients due to the exploding growth of model parameters (e.g., billion-parameter models). Split learning (SL), a recent framework, reduces the client compute load by splitting model training between client and server. This flexibility is extremely useful for…
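As a rough illustration of the client/server split the abstract describes, the sketch below partitions a small PyTorch network at an arbitrary cut layer; the architecture, split point, and tensor shapes are illustrative assumptions, not the configuration used in the AdaSplit paper.

# Minimal sketch of the client/server split described in the abstract (PyTorch).
# The architecture, cut layer, and shapes are illustrative assumptions, not the
# exact configuration used in the AdaSplit paper.
import torch
import torch.nn as nn

class ClientModel(nn.Module):
    """Shallow front portion of the network, kept on the resource-constrained client."""
    def __init__(self):
        super().__init__()
        self.front = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )

    def forward(self, x):
        return self.front(x)

class ServerModel(nn.Module):
    """Deeper back portion of the network, trained on the server."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.back = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(32, num_classes),
        )

    def forward(self, activations):
        return self.back(activations)

# One split-learning step: the client uploads intermediate activations ("smashed
# data"), the server finishes the forward/backward pass and returns the gradient
# at the cut layer so the client can update its own layers.
client, server = ClientModel(), ServerModel()
opt_c = torch.optim.SGD(client.parameters(), lr=0.01)
opt_s = torch.optim.SGD(server.parameters(), lr=0.01)

x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
acts = client(x)                                   # client-side forward
acts_server = acts.detach().requires_grad_()       # this tensor crosses the network
loss = nn.functional.cross_entropy(server(acts_server), y)
opt_s.zero_grad(); loss.backward(); opt_s.step()   # server-side update
opt_c.zero_grad()
acts.backward(acts_server.grad)                    # continue backprop on the client
opt_c.step()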

Cited by 4 publications (5 citation statements) | References 56 publications

“…[55] seeks to reduce the network delay incurred by FL by performing communication and local learning concurrently, at the price of the global model being behind local ones by several epochs. In a similar setting, [56] optimizes the computation, communication, and cooperation aspects of FL in resource-constrained scenarios. [57] leverages RL to identify the best split of a learning task (e.g., the layers of a DNN) across the available network nodes.…”
Section: Related Work (mentioning)
confidence: 99%
“…Chopra et al. enable device heterogeneity in split learning, where the full NN is split, and part of the model is trained on the device while the other part is trained on the server. They present AdaSplit [20], which allows for different device model sizes by varying the split position between the device and the server. While in baseline split learning activations have to be uploaded to the server and gradients have to be downloaded from the server, AdaSplit mitigates this by using a contrastive loss to train locally without server interaction, sending activations to the server only after the local phase.…”
Section: NN Architecture Heterogeneity Based on Other Techniques (mentioning)
confidence: 99%
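The two-phase schedule described in this statement (a local, contrastive-only phase with no server interaction, followed by uploading activations) can be sketched roughly as below. The NT-Xent-style loss, the projection head, the augmentation function, and the send_to_server transport helper are all illustrative assumptions rather than AdaSplit's exact components.

# Hedged sketch of a two-phase client schedule: contrastive local training first,
# activations uploaded to the server only afterwards. All helpers passed in
# (proj_head, augment, send_to_server, loader) are assumed, not from the paper.
import torch
import torch.nn.functional as F

def ntxent_loss(z1, z2, temperature=0.5):
    """NT-Xent-style contrastive loss over two augmented views (illustrative)."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                       # (2N, d)
    sim = (z @ z.t()) / temperature                      # pairwise cosine similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float('-inf'))           # drop self-similarity
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

def train_client(client, proj_head, loader, augment, opt, send_to_server, local_epochs=5):
    # Phase 1: purely local training with a contrastive objective; no activations
    # or gradients cross the network during this phase.
    for _ in range(local_epochs):
        for x, _ in loader:
            z1 = proj_head(client(augment(x)).flatten(1))
            z2 = proj_head(client(augment(x)).flatten(1))
            loss = ntxent_loss(z1, z2)
            opt.zero_grad(); loss.backward(); opt.step()
    # Phase 2: only now are (detached) activations uploaded to the server, which
    # trains its own portion of the model on them.
    for x, y in loader:
        send_to_server(client(x).detach(), y)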
“…It enabled a data scientist to maintain raw data on an owner's device while training neural networks on vertically partitioned data features among several owners. AdaSplit [48] is another hybrid approach combining SL and FL that enables efficient scaling of SL to low-resource scenarios, reducing bandwidth consumption and improving performance across heterogeneous clients. The authors in [49] suggested a hybrid approach of updating client- and server-side models simultaneously through local-loss-based training.…”
Section: Hybrid Split-Federated Learning (mentioning)
confidence: 99%
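A minimal sketch of the local-loss idea mentioned in this statement follows: the client updates its layers from an auxiliary local loss while the server updates from the task loss on detached activations, so no gradients travel back over the network. The auxiliary classifier head and the choice of losses are assumptions for illustration, not the exact method of [49].

# Generic sketch of simultaneous client/server updates via a local loss.
# aux_head is an assumed small classifier attached at the cut layer; opt_c is
# assumed to cover both the client layers and aux_head.
import torch.nn.functional as F

def split_step_local_loss(client, aux_head, server, opt_c, opt_s, x, y):
    # Client side: forward, then update immediately from the auxiliary local loss.
    acts = client(x)
    local_loss = F.cross_entropy(aux_head(acts.flatten(1)), y)
    opt_c.zero_grad(); local_loss.backward(); opt_c.step()

    # Server side: receives detached activations and updates with the task loss.
    # No gradient is sent back, so client and server can update in parallel.
    server_loss = F.cross_entropy(server(acts.detach()), y)
    opt_s.zero_grad(); server_loss.backward(); opt_s.step()
    return local_loss.item(), server_loss.item()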