2019
DOI: 10.3390/electronics9010028

Hardware Resource Analysis in Distributed Training with Edge Devices

Abstract: When training a deep learning model with distributed training, the hardware resource utilization of each device depends on the model structure and the number of devices used for training. Distributed training has recently been applied to edge computing. Since edge devices have hardware resource limitations such as memory, there is a need for training methods that use hardware resources efficiently. Previous research focused on reducing training time by optimizing the synchronization process between edge device…
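The abstract's point that per-device utilization shifts with model structure and device count is easiest to check by logging resources on each worker while training runs. Below is a minimal sketch of such per-device sampling, assuming psutil purely for illustration; it is not the paper's own tooling (the citing work quoted further down notes that Ganglia handled monitoring).

```python
# Hedged sketch of per-device resource sampling during a training loop,
# using psutil (an assumption for illustration; the cited work reports
# using Ganglia for this kind of monitoring).
import time
import psutil

def sample_resources():
    """Return current CPU, memory, and network counters for this device."""
    mem = psutil.virtual_memory()
    net = psutil.net_io_counters()
    return {
        'cpu_percent': psutil.cpu_percent(interval=None),
        'mem_used_mb': mem.used / 2**20,
        'net_sent_mb': net.bytes_sent / 2**20,
        'net_recv_mb': net.bytes_recv / 2**20,
    }

for step in range(3):        # stand-in for the real distributed training loop
    time.sleep(0.1)          # stand-in for forward/backward/synchronization
    print(step, sample_resources())
```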

Cited by 4 publications (3 citation statements)
References 23 publications (24 reference statements)
“…MXNet uses KVStore to synchronize parameters shared among participants during the learning process. To monitor the utilization of pervasive resources, Ganglia [55] is used to identify the memory, CPU, and network requirements of training and to track hardware usage for each participant. As for the inference phase, the authors in [56] designed a hardware prototype targeting distributed deep learning for on-device prediction.…”
Section: Pervasive Framework for AI (citation type: mentioning)
confidence: 99%
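For context on the quoted synchronization mechanism, MXNet's KVStore exposes a push/pull interface for shared parameters. The sketch below exercises that interface on a local store, assuming a 'dist_sync' store would replace it across machines; it is illustrative, not the cited paper's training code.

```python
# Minimal sketch of parameter synchronization with MXNet's KVStore
# (illustrative only; not the cited paper's actual training script).
import mxnet as mx

kv = mx.kv.create('local')            # use 'dist_sync' across real workers
shape = (2, 3)
kv.init(3, mx.nd.ones(shape))         # register key 3 with an initial value

grad = mx.nd.ones(shape) * 0.1
kv.push(3, grad)                      # each worker pushes its gradient
out = mx.nd.zeros(shape)
kv.pull(3, out=out)                   # and pulls back the aggregated value
print(out.asnumpy())
```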
“…The training dataset is divided into mini-batches, and all mini-batches are traversed, with the parameters adjusted on each mini-batch. A pass over all mini-batches corresponds to one epoch [40,41].…”
Section: Gradient Descent with Mini-Batches (citation type: mentioning)
confidence: 99%
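The quoted "mini-lots" are mini-batches: an epoch is one complete pass over all mini-batches, with a parameter update on each one. A short NumPy sketch of that loop, on illustrative linear-regression data:

```python
# Hedged sketch of mini-batch gradient descent: one epoch is one full pass
# over all mini-batches, with a parameter update per mini-batch.
# Plain NumPy linear regression; data and names are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
true_w = np.array([1.0, -2.0, 0.5, 3.0, 0.0])
y = X @ true_w + 0.01 * rng.normal(size=1000)

w = np.zeros(5)
lr, batch_size, epochs = 0.1, 32, 5
for epoch in range(epochs):                     # one epoch = full traversal
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):  # iterate over mini-batches
        b = order[start:start + batch_size]
        grad = 2 * X[b].T @ (X[b] @ w - y[b]) / len(b)
        w -= lr * grad                          # update on each mini-batch
print(w)                                        # should approach true_w
```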
“…To evaluate the performance of the proposed model, we […] and an output layer that predicts values via the fully connected layer. Besides, LeNet-5 works well with handwritten datasets [37]; it also reduces the number of parameters and can automatically learn features from raw pixels [38].…”
Citation type: mentioning
confidence: 99%
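For reference, the LeNet-5 stack mentioned in the quote (convolution and pooling layers feeding fully connected layers and a small output layer) can be written compactly in MXNet Gluon. This is a generic sketch with the classic layer sizes, not the citing paper's exact model:

```python
# Generic LeNet-5-style network in MXNet Gluon (classic layer sizes;
# not the citing paper's exact configuration).
import mxnet as mx
from mxnet.gluon import nn

net = nn.Sequential()
net.add(
    nn.Conv2D(channels=6, kernel_size=5, activation='tanh'),   # C1
    nn.AvgPool2D(pool_size=2, strides=2),                      # S2
    nn.Conv2D(channels=16, kernel_size=5, activation='tanh'),  # C3
    nn.AvgPool2D(pool_size=2, strides=2),                      # S4
    nn.Flatten(),
    nn.Dense(120, activation='tanh'),                          # C5
    nn.Dense(84, activation='tanh'),                           # F6
    nn.Dense(10),                                              # output layer
)
net.initialize()
x = mx.nd.random.uniform(shape=(1, 1, 28, 28))   # e.g. one MNIST-sized image
print(net(x).shape)                               # (1, 10)
```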