A growing trend is to deploy deep learning algorithms in edge environments to mitigate the privacy and latency issues of cloud computing. Diverse edge deep learning accelerators have been devised to speed up the inference of deep learning algorithms on edge devices. These accelerators differ widely in power and performance characteristics, which makes it challenging to compare them efficiently and uniformly. In this paper, we introduce EDLAB, an end-to-end benchmark for evaluating the overall performance of edge deep learning accelerators. EDLAB consists of state-of-the-art deep learning models, a unified workload preprocessing and deployment framework, and a collection of comprehensive metrics. In addition, we propose parameterized models of the hardware performance bound so that EDLAB can identify the potential of the hardware and the hardware utilization of different deep learning applications. Finally, we employ EDLAB to benchmark three edge deep learning accelerators and analyze the benchmarking results. From this analysis we derive several observations that can guide the design of efficient deep learning applications.
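A common way to parameterize a hardware performance bound of this kind is a roofline-style model built from peak compute throughput and memory bandwidth. The sketch below illustrates the idea only; the function names, parameters, and numbers are illustrative assumptions, not EDLAB's actual API or measured data.

```python
# Hypothetical roofline-style performance bound, in the spirit of the
# parameterized hardware models described above; all names are illustrative.

def attainable_throughput(peak_ops_per_s: float,
                          mem_bandwidth_bytes_per_s: float,
                          arithmetic_intensity: float) -> float:
    """Upper bound on throughput (ops/s) for a workload with the given
    arithmetic intensity (ops per byte moved), per the roofline model."""
    return min(peak_ops_per_s, mem_bandwidth_bytes_per_s * arithmetic_intensity)

def hardware_utilization(measured_ops_per_s: float,
                         bound_ops_per_s: float) -> float:
    """Fraction of the modeled bound actually achieved by an application."""
    return measured_ops_per_s / bound_ops_per_s

# Example (made-up numbers): a 4 TOPS accelerator with 25 GB/s memory
# bandwidth running a layer with arithmetic intensity of 50 ops/byte.
bound = attainable_throughput(4e12, 25e9, 50.0)  # memory-bound: 1.25e12 ops/s
print(hardware_utilization(0.9e12, bound))       # 0.72 -> 72% utilization
```

Under such a model, a low utilization figure points to an application that is poorly matched to the accelerator rather than to a lack of raw hardware capability.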
Edge devices have been widely adopted to bring deep learning applications onto low-power embedded systems, mitigating the privacy and latency issues of accessing cloud servers. The increasing computational demand of complex neural network models, however, leads to high latency on edge devices with limited resources. Many application scenarios are real-time and impose strict latency constraints, while conventional neural network compression methods are not latency-oriented. In this work, we propose a novel training method for compact neural networks that reduces model latency on latency-critical edge systems. A latency predictor is introduced to guide and optimize this procedure. Coupled with the latency predictor, our method can guarantee the latency of a compact model with only one training process. Experimental results show that, compared to state-of-the-art model compression methods, our approach fits 'hard' latency constraints well, significantly reducing latency with only a mild accuracy drop. To satisfy a 34 ms latency constraint, we compact ResNet-50 with a 0.82% accuracy drop; for GoogLeNet, accuracy even increases by 0.3%.
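One common way to build such a latency predictor is a lookup table of per-layer latencies profiled on the target device, summed over a candidate architecture. The sketch below is a minimal illustration of that approach under this assumption; the table keys, entries, and 34 ms budget check are illustrative, not the paper's actual predictor.

```python
# Hypothetical lookup-table latency predictor of the kind that can guide
# latency-constrained compression; table contents are illustrative only.

# Measured on-device latency (ms) per layer configuration, keyed by
# (layer type, spatial size, channel count) -- populated by profiling
# each configuration once on the target edge device.
LATENCY_TABLE = {
    ("conv3x3", 56, 64): 1.8,
    ("conv3x3", 56, 32): 0.9,
    ("conv1x1", 56, 64): 0.5,
}

def predict_latency(layers) -> float:
    """Estimate end-to-end model latency as the sum of per-layer entries."""
    return sum(LATENCY_TABLE[layer] for layer in layers)

def meets_constraint(layers, budget_ms: float) -> bool:
    """Check a candidate compact architecture against a hard latency budget."""
    return predict_latency(layers) <= budget_ms

candidate = [("conv3x3", 56, 32), ("conv1x1", 56, 64)]
print(predict_latency(candidate))          # 1.4 (ms)
print(meets_constraint(candidate, 34.0))   # True
```

Because the predictor is cheap to evaluate, a training procedure can query it at every candidate compression decision, which is what makes a single-pass, latency-guaranteed training process feasible.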