2022
DOI: 10.3390/app122010619
Partitioning DNNs for Optimizing Distributed Inference Performance on Cooperative Edge Devices: A Genetic Algorithm Approach

Abstract: To fully unleash the potential of edge devices, a popular approach is to partition a neural network into multiple pieces and distribute them among available edge devices to perform inference cooperatively. Up to now, the problem of partitioning a deep neural network (DNN) so as to achieve optimal distributed inference performance has not been adequately addressed. This paper proposes a novel layer-based DNN partitioning approach to obtain an optimal distributed deployment solution. In order to ensure the applicab…
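The abstract's core idea, using a genetic algorithm to assign DNN layers to cooperating edge devices so that compute and communication costs are jointly minimized, can be illustrated with a toy sketch. All numbers, the latency model, and the GA operators below are illustrative assumptions, not details from the paper:

```python
import random

random.seed(0)

# Hypothetical per-layer compute costs (ms) at unit speed.
LAYER_COST = [4.0, 8.0, 8.0, 2.0]
# Hypothetical device slowdown factors (lower = faster device).
DEVICE_SPEED = [1.0, 0.5]
# Assumed cost (ms) of shipping an activation between two devices.
COMM_COST = 3.0

def fitness(assignment):
    """Estimated inference latency: per-layer compute on the assigned
    device, plus a communication penalty whenever two consecutive
    layers land on different devices."""
    compute = sum(LAYER_COST[i] * DEVICE_SPEED[d]
                  for i, d in enumerate(assignment))
    comm = COMM_COST * sum(1 for a, b in zip(assignment, assignment[1:])
                           if a != b)
    return compute + comm

def crossover(a, b):
    # Single-point crossover between two parent assignments.
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(assignment, rate=0.3):
    # Re-assign each layer to a random device with probability `rate`.
    return [random.randrange(len(DEVICE_SPEED)) if random.random() < rate else d
            for d in assignment]

def genetic_partition(pop_size=20, generations=50):
    # Each individual maps layer index -> device index.
    pop = [[random.randrange(len(DEVICE_SPEED)) for _ in LAYER_COST]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[:pop_size // 2]  # elitist truncation selection
        children = [mutate(crossover(random.choice(survivors),
                                     random.choice(survivors)))
                    for _ in range(pop_size - len(survivors))]
        pop = survivors + children
    return min(pop, key=fitness)

best = genetic_partition()
print("assignment:", best, "latency:", fitness(best))
```

Because the second device is strictly faster here and switching devices costs 3 ms per hop, the GA tends toward assignments that cluster consecutive layers on the fast device; real deployments would replace this latency model with profiled per-device and per-link measurements.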

Cited by 6 publications (3 citation statements)
References 41 publications (54 reference statements)
“…In turn, the camera will run the inference task assigned to it and send the corresponding output to the smartphone, which performs the second exit branch and obtains corresponding identification results. There are more similar application scenarios, such as distributed fall detection [8] and traffic prediction [9].…”
Section: Introduction
confidence: 99%
“…Deploying DNN models on edge devices (e.g., embedded systems) presents various challenges, including the limited computational power and memory of edge devices, which can often prevent the deployment of large DNN models entirely. For example, Convolutional Neural Networks (CNNs), another type of DNN, can be large and computationally intensive, which makes it challenging to deploy an entire CNN model on a single-core edge device [6]. Traditionally, machine learning-based CPS applications were run sequentially on a single-core processor or device [7].…”
Section: Introduction
confidence: 99%
“…For example, Chuang Hu et al [12] proposed a min-cut-based algorithm to partition and offload DNNs in both edge and cloud environments. Similarly, cloud and edge-assisted approaches [6,13] divide the DNNs into two parts to offer local and remote computation. However, previous studies have not addressed the ideal number of partitions for the DNN model concerning layer dependencies, device computing power, and communication latency.…”
Section: Introduction
confidence: 99%
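The two-part edge/cloud partitioning mentioned in this citation statement amounts to picking a single split layer that minimizes edge compute plus transfer plus cloud compute. A minimal sketch, with all timing, bandwidth, and activation-size numbers being illustrative assumptions rather than values from the cited works:

```python
# Hypothetical per-layer compute times (ms) on the edge device and in
# the cloud, and the activation size (KB) each layer produces.
EDGE_MS  = [5.0, 10.0, 10.0, 4.0]
CLOUD_MS = [1.0, 2.0, 2.0, 1.0]
ACT_KB   = [200.0, 100.0, 50.0, 10.0]
BANDWIDTH_KB_PER_MS = 25.0  # assumed uplink bandwidth
INPUT_KB = 400.0            # assumed size of the raw input

def latency(split):
    """End-to-end latency when layers [0, split) run on the edge and
    layers [split, n) run in the cloud. split == 0 offloads everything;
    split == n runs everything on the edge."""
    edge = sum(EDGE_MS[:split])
    # Data uploaded: the raw input if nothing runs on the edge,
    # otherwise the activation of the last edge-side layer.
    sent = INPUT_KB if split == 0 else ACT_KB[split - 1]
    transfer = 0.0 if split == len(EDGE_MS) else sent / BANDWIDTH_KB_PER_MS
    cloud = sum(CLOUD_MS[split:])
    return edge + transfer + cloud

# Exhaustively evaluate every candidate split point.
best_split = min(range(len(EDGE_MS) + 1), key=latency)
print("best split:", best_split, "latency:", latency(best_split))
```

With these numbers the early layers shrink the data substantially, so splitting after the first layer beats both full offloading and pure edge execution; this is exactly the two-part trade-off the cited approaches optimize, whereas the paper above generalizes it to many partitions across many devices.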