After optimizing the model algorithm, we propose a hardware-aware collaborative training framework based on Federated Learning (FL), which expands the effective training dataset for higher accuracy. Moreover, it learns heterogeneous models that simultaneously meet the latency constraints of multiple edge systems. We use our high-accuracy dynamic zeroizing-recovering method to adapt each local model to its latency constraint (a hedged sketch of this idea is given at the end of this section). A proto-corrected aggregation scheme is further designed to aggregate all the heterogeneous local models, so that a single training process satisfies the latency constraints of the different systems while maintaining high accuracy (see also the aggregation sketch at the end of this section).

However, in scenarios that demand extremely low power consumption and high throughput, emerging accelerators are needed to further optimize edge intelligence. The IMP architecture is promising for DNN inference. To meet resource constraints and minimize power consumption on IMP devices, we use filter-group pruning and crossbar pruning to reduce crossbar usage without extra hardware units for data alignment (the crossbar-usage sketch at the end of this section illustrates why group-aligned pruning translates into crossbar savings). In addition, we adopt a non-ideality adaptation and self-compensation scheme that exploits the characteristics of crossbars to mitigate the impact of non-idealities without large hardware overhead. Finally, we integrate these techniques into one training process for co-optimization, which improves the accuracy of the final model.

In summary, we achieve efficient edge intelligence by optimizing DNN algorithms, training data, and computing devices, encompassing both software and hardware aspects. This unlocks the potential of edge intelligence, ensuring data privacy, achieving high accuracy, and sustaining high throughput across various applications.

In the future, we will continue to focus on hardware-software co-design for edge intelligence. First, we intend to develop a dynamic reconfiguration architecture that can seamlessly switch IMP cells between memory and computing functions, optimally allocating memory and computing resources to enhance DNN inference efficiency. Second, we will design IMP accelerators that support a wider range of algorithms, such as the Transformer, co-optimizing algorithms, data, and IMP devices to comprehensively advance the capabilities and applications of edge intelligence. Third, we will propose a hybrid CNN-Transformer Neural Architecture Search (NAS) framework for the IMP architecture to achieve hardware-friendly, highly accurate, robust, low-latency, and low-power IMP-based edge intelligence.
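As an illustration of the dynamic zeroizing-recovering idea, the following PyTorch sketch rebuilds a magnitude-based keep-mask after every optimizer step, so weights zeroized earlier can be recovered once gradient updates grow them back past the threshold. This is a minimal sketch, not the exact method of this thesis: the function names, the plain magnitude criterion, and the fixed sparsity value are all illustrative assumptions, and in practice the sparsity would be derived from each edge system's latency budget.

    import torch
    import torch.nn as nn

    def magnitude_mask(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
        # Keep-mask that zeroizes the smallest-magnitude fraction of weights.
        k = int(sparsity * weight.numel())
        if k == 0:
            return torch.ones_like(weight)
        threshold = torch.kthvalue(weight.abs().flatten(), k).values
        return (weight.abs() > threshold).float()

    def zeroize_recover_step(layer: nn.Linear, sparsity: float) -> None:
        # Rebuild the mask from *current* magnitudes each step, so a weight
        # zeroized earlier is recovered automatically once gradient updates
        # grow it past the threshold again.
        with torch.no_grad():
            layer.weight.mul_(magnitude_mask(layer.weight, sparsity))

    # Usage sketch: apply after each optimizer step; in a latency-constrained
    # setting, 'sparsity' would be chosen per edge system so the pruned model
    # fits its latency budget (that mapping is not shown here).
    layer = nn.Linear(128, 64)
    optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)
    x, y = torch.randn(32, 128), torch.randn(32, 64)
    loss = nn.functional.mse_loss(layer(x), y)
    loss.backward()
    optimizer.step()
    zeroize_recover_step(layer, sparsity=0.5)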
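The proto-corrected aggregation scheme itself is specific to this thesis; as a generic stand-in, the following sketch shows the basic problem any such scheme must solve: averaging heterogeneous local models whose latency-driven masks keep different subsets of weights. Each global weight is averaged only over the clients that actually kept it, so aggressively pruned clients do not pull retained weights toward zero. The function name and per-tensor interface are hypothetical, and the prototype-based correction of the actual scheme is not reproduced here.

    import torch

    def masked_average(client_weights, client_masks):
        # Stack per-client tensors: shape (num_clients, *weight_shape).
        stacked_w = torch.stack(client_weights)
        stacked_m = torch.stack(client_masks)
        # Count, per weight, how many clients kept it; the clamp avoids a
        # divide-by-zero for weights pruned on every client.
        counts = stacked_m.sum(dim=0).clamp(min=1.0)
        # Average each weight over the clients whose mask kept it.
        return (stacked_w * stacked_m).sum(dim=0) / counts

    # Usage sketch with three clients holding differently pruned 2x2 layers.
    w = [torch.tensor([[1.0, 0.0], [2.0, 4.0]]),
         torch.tensor([[3.0, 0.0], [0.0, 2.0]]),
         torch.tensor([[5.0, 6.0], [0.0, 0.0]])]
    m = [torch.tensor([[1.0, 0.0], [1.0, 1.0]]),
         torch.tensor([[1.0, 0.0], [0.0, 1.0]]),
         torch.tensor([[1.0, 1.0], [0.0, 0.0]])]
    print(masked_average(w, m))  # (0,0) averages all three clients; (1,1) only two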
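Finally, the crossbar-usage sketch below shows why filter-group pruning maps cleanly onto crossbar savings. When a convolution layer is unrolled into an (in_channels * k * k) x out_channels matrix and tiled onto fixed-size crossbars, usage drops only when whole tiles are freed; pruning filters in groups aligned to the crossbar width removes entire column tiles without extra data-alignment hardware. The function and the 128x128 crossbar size are illustrative assumptions, not parameters taken from this thesis.

    import math

    def crossbars_needed(in_channels, out_channels, kernel_size,
                         xbar_rows=128, xbar_cols=128):
        # A conv layer unrolled for IMP mapping occupies an
        # (in_channels * k * k) x out_channels weight matrix,
        # tiled onto fixed-size crossbars.
        rows = in_channels * kernel_size * kernel_size
        return (math.ceil(rows / xbar_rows)
                * math.ceil(out_channels / xbar_cols))

    # Pruning 128 filters as one aligned group frees a full column of
    # crossbar tiles; pruning 127 scattered filters frees none.
    print(crossbars_needed(256, 512, 3))  # 18 row tiles x 4 col tiles = 72
    print(crossbars_needed(256, 384, 3))  # 18 x 3 = 54 after aligned pruning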