2017 IEEE International Conference on Computer Design (ICCD)
DOI: 10.1109/iccd.2017.49
A Dynamic Deep Neural Network Design for Efficient Workload Allocation in Edge Computing

Cited by 40 publications (26 citation statements) · References 12 publications
“…In DeepIns, edge devices are responsible for data collection, the edge server acts as the first exit point, and the cloud data center acts as the second exit point. Lo et al [90] then propose adding an authentic operation (AO) unit to the basic BranchyNet model. The AO unit determines whether an input has to be transferred to the edge server or cloud data center for further execution by setting a different confidence-level threshold for each DNN output class.…”
Section: Enabling Technologies (mentioning)
confidence: 99%
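As a rough illustration of the AO-unit idea described in the statement above, the per-class confidence check can be sketched as follows; the threshold values, class indices, and the should_offload helper are hypothetical assumptions, not taken from the cited implementation.

```python
import numpy as np

# Hypothetical per-class confidence thresholds (illustrative values only).
PER_CLASS_THRESHOLDS = {0: 0.90, 1: 0.85, 2: 0.95}

def should_offload(softmax_probs: np.ndarray) -> bool:
    """Return True if the input should be forwarded to the edge server / cloud."""
    predicted_class = int(np.argmax(softmax_probs))
    confidence = float(softmax_probs[predicted_class])
    threshold = PER_CLASS_THRESHOLDS.get(predicted_class, 0.90)
    # Low confidence at the local exit point -> transfer for further execution.
    return confidence < threshold

# Example: a local prediction for class 1 with confidence 0.88 (>= 0.85) is kept.
probs = np.array([0.05, 0.88, 0.07])
print("offload" if should_offload(probs) else "accept local early exit")
```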
“…This is because a data dependency exists between each pair of DNN layers. Lo et al [16] presented a dynamic DNN design technique to manage workload transmission in edge computing under a given accuracy requirement. They used a dynamic network structure and an authentic operation (AO) unit to enhance DNNs, which reduced the amount of workload transmitted while still achieving the required accuracy.…”
Section: Related Work (mentioning)
confidence: 99%
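A minimal sketch of how such a dynamic, multi-exit structure can cut transmitted workload, assuming a device–edge–cloud hierarchy in which each tier only forwards an input when its own exit is not confident enough; the tier models and thresholds below are placeholders, not the authors' code.

```python
from typing import Callable, List, Tuple
import numpy as np

def multi_exit_inference(
    x: np.ndarray,
    tiers: List[Tuple[Callable[[np.ndarray], np.ndarray], float]],
) -> Tuple[int, int]:
    """Run tiers in order (device, edge, cloud); stop at the first confident exit."""
    for i, (tier_fn, threshold) in enumerate(tiers):
        probs = tier_fn(x)
        if float(probs.max()) >= threshold:
            return int(probs.argmax()), i  # confident: nothing more is transmitted
        # Otherwise the input (or intermediate features) is transmitted upstream.
    return int(probs.argmax()), len(tiers) - 1  # fall back to the last tier's answer

# Hypothetical stand-ins for the device, edge-server, and cloud models.
device_model = lambda x: np.array([0.60, 0.30, 0.10])
edge_model   = lambda x: np.array([0.20, 0.70, 0.10])
cloud_model  = lambda x: np.array([0.10, 0.05, 0.85])

tiers = [(device_model, 0.90), (edge_model, 0.90), (cloud_model, 0.00)]
print(multi_exit_inference(np.zeros(3), tiers))  # -> (2, 2): only the cloud exit is confident
```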
“…Model compression, such as pruning [46], data quantization [47] and knowledge distillation [48], can reduce model complexity to relieve the pressure on end devices. Model early-exit [49], [50], model selection [51], model partition [52], [53] and input filtering [54], [55] can accelerate inference, further speeding up deployment on edge devices with limited memory.…”
Section: Model Establishment (mentioning)
confidence: 99%
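For illustration, the model-partition idea mentioned above (splitting a network between an end device and an edge server) might look like the following sketch; the toy architecture and split point are arbitrary assumptions, not drawn from the cited works.

```python
import torch
import torch.nn as nn

# Toy CNN used only to illustrate partitioning; layers and split point are
# arbitrary assumptions, not an architecture from the cited papers.
full_model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),   # runs on the end device
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),  # runs on the edge server
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10),
)

split_point = 2                          # hypothetical partition index
device_part = full_model[:split_point]   # executed locally
server_part = full_model[split_point:]   # executed remotely

x = torch.randn(1, 3, 32, 32)         # input captured on the device
intermediate = device_part(x)         # only this tensor is transmitted upstream
logits = server_part(intermediate)    # remaining layers run on the edge server
print(logits.shape)                   # torch.Size([1, 10])
```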