2021 58th ACM/IEEE Design Automation Conference (DAC) 2021
DOI: 10.1109/dac18074.2021.9586309
ZeroBN: Learning Compact Neural Networks For Latency-Critical Edge Systems

Abstract: Edge devices have been widely adopted to bring deep learning applications onto low-power embedded systems, mitigating the privacy and latency issues of accessing cloud servers. The increasing computational demand of complex neural network models leads to large latency on edge devices with limited resources. Many application scenarios are real-time and have strict latency constraints, while conventional neural network compression methods are not latency-oriented. In this work, we propose a novel compact neur…


Cited by 13 publications (9 citation statements)
References 15 publications
“…Finally, the completely pruned model will be trained from scratch as described in Section IV-A for the final accuracy. Compared to global search algorithms [21] that directly search the huge design space for the optimal solution, our heuristic pruning algorithm greatly reduces the exploration overhead.…”
Section: Heuristic Architecture Descent
confidence: 99%
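The heuristic pruning the excerpt contrasts with global search can be sketched as a greedy loop: repeatedly drop the least-important channel until an estimated latency budget is met, rather than enumerating the full design space. A minimal illustration, assuming a per-channel importance score and a per-layer latency cost (the function `heuristic_prune` and its inputs are hypothetical, not from the paper):

```python
import numpy as np

def heuristic_prune(channel_scores, latency_per_channel, budget):
    """Greedily drop the globally least-important remaining channel until
    the estimated latency (kept channels x per-channel cost) fits the
    budget. One channel per step, so the exploration cost is linear in
    the number of pruned channels instead of exponential in the space
    of channel configurations."""
    keep = [np.ones(len(s), dtype=bool) for s in channel_scores]

    def latency():
        return sum(k.sum() * c for k, c in zip(keep, latency_per_channel))

    while latency() > budget:
        best = None  # (score, layer_index, channel_index) of weakest channel
        for li, (s, k) in enumerate(zip(channel_scores, keep)):
            for ci in np.where(k)[0]:
                if best is None or s[ci] < best[0]:
                    best = (s[ci], li, ci)
        if best is None:
            break  # nothing left to prune
        _, li, ci = best
        keep[li][ci] = False
    return keep

# Two layers with 3 and 2 channels; higher score = more important.
scores = [np.array([0.9, 0.1, 0.5]), np.array([0.3, 0.8])]
keep = heuristic_prune(scores, latency_per_channel=[2.0, 3.0], budget=9.0)
print([k.tolist() for k in keep])  # → [[True, False, True], [False, True]]
```

The pruned model would then be retrained from scratch for final accuracy, as the excerpt describes.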
“…Constrained by limited resources, these systems struggle to meet latency requirements, often requiring compromises that can entail frame drops and a potential reduction in overall analytical quality. Recently, using less capable yet lightweight DNN models has been introduced as a viable solution [3], [31], [32], [39], allowing the systems to match the inference rate with the frame rate. However, the customized models present a new challenge known as data drift, where live video data diverges from the training data, consequently reducing the accuracy of the lightweight models.…”
Section: A Video Analytics in Autonomous Systems
confidence: 99%
“…A latency predictor is necessary for a highly efficient model optimization process under latency constraints. During model architecture optimization, the model size is dynamically adjusted [40,41]. Thus, it is unfeasible to exhaustively include all potential model architecture configurations within a table.…”
Section: Efficient and Highly-Accurate Latency Prediction
confidence: 99%
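The excerpt's argument is that a lookup table cannot enumerate every architecture configuration that arises while the model size is adjusted, so a learned latency predictor generalizes instead. A minimal sketch using a least-squares linear model (the feature names and the measured samples below are illustrative, not from the paper):

```python
import numpy as np

# Hypothetical measured (config -> latency) samples: features are
# [depth, avg_width, input_resolution]; latencies in milliseconds.
configs = np.array([
    [10,  64, 224],
    [18, 128, 224],
    [34, 256, 192],
    [50, 256, 224],
], dtype=float)
latencies = np.array([4.1, 9.8, 21.5, 33.0])

# Fit a least-squares linear predictor; unlike a table, it can score
# configurations never measured during architecture optimization.
X = np.hstack([configs, np.ones((len(configs), 1))])  # append bias term
w, *_ = np.linalg.lstsq(X, latencies, rcond=None)

def predict_latency(depth, width, resolution):
    return float(np.array([depth, width, resolution, 1.0]) @ w)

# Query an unseen configuration during the search:
print(predict_latency(26, 192, 224))
```

In practice a predictor would use richer features (per-layer shapes, operator types) and a nonlinear model, but the interface is the same: configuration in, estimated latency out.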
“…The demand for low-power edge intelligence is rapidly increasing in everyday devices like mobile phones and wearable gadgets [112]. However, the efficient deployment of DNNs on low-power devices is hindered by their growing need for computational ability and memory resources [41]. DNN algorithms exhibit a high degree of computing parallelism but require large memory access, thus, the ReRAM-based IMP crossbar architecture is an emerging and promising solution for efficiently accelerating these DNN algorithms.…”
Section: Overview
confidence: 99%
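The crossbar computation the excerpt refers to is analog matrix-vector multiplication: weights are stored as cell conductances, inputs are applied as row voltages, and Kirchhoff's current law sums the products down each column in a single step. A schematic digital simulation of that principle (the conductance and voltage values are illustrative):

```python
import numpy as np

# Crossbar cells store weights as conductances G (siemens); input
# activations are applied as row voltages V. Each column's output
# current is the dot product of its conductances with the voltages,
# i.e. I = G^T @ V, computed in one analog step.
G = np.array([[1e-6, 2e-6],
              [3e-6, 4e-6],
              [5e-6, 6e-6]])        # 3 rows (inputs) x 2 columns (outputs)
V = np.array([0.2, 0.4, 0.6])       # row voltages (volts)

I = G.T @ V                          # column currents (amps)
print(I)                             # → [4.4e-06 5.6e-06]
```

This is why the excerpt notes the fit with DNN workloads: the dominant kernel (matrix-vector multiply) maps directly onto the crossbar with high parallelism and no weight movement.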