FlexDNN: Input-Adaptive On-Device Deep Learning for Efficient Mobile Vision

Fang, Biyi; Zeng, Xiao; Zhang, Faen; Xu, Hui; Zhang, Mi

doi:10.1109/sec50012.2020.00014

Cited by 41 publications

(25 citation statements)

References 23 publications


“…Accuracy is the top-1 accuracy of the model execution and latency is measured as the end-to-end delay in processing an input (e.g., an image). We use GPU frequency to quantify the power of edge device, which is reasonable because there usually exists approximately linear correlations between power and frequency [7,35]. We use SR to represent the percentage of executions that satisfy the latency and energy constraints imposed by users.…”

Section: Methods (mentioning)

confidence: 99%
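The SR metric described in the statement above can be illustrated with a minimal sketch. It assumes SR is the percentage of executions whose latency and energy both fall within user-set limits; the `Execution` record, its field names, and the units are hypothetical and not taken from the cited paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Execution:
    """One model execution (hypothetical record; units are assumptions)."""
    latency_ms: float
    energy_mj: float


def success_rate(executions: Sequence[Execution],
                 max_latency_ms: float,
                 max_energy_mj: float) -> float:
    """Return the percentage of executions meeting both constraints."""
    if not executions:
        return 0.0
    satisfied = sum(
        1 for e in executions
        if e.latency_ms <= max_latency_ms and e.energy_mj <= max_energy_mj
    )
    return 100.0 * satisfied / len(executions)
```

An execution counts as successful only when it meets both limits, so a run that is fast but exceeds the energy budget still lowers SR.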

“…Although above techniques reduce the overhead of DNN execution, they are not designed to optimize the performance of DNN in the presence of dynamics of input data and resource budget. Extending the static model compression approach, several efforts [7,8,27] proposed dynamic neural networks that allow selective execution to improve DNN compute efficiency. D2NN [27] optimizes dynamic resource-accuracy trade-offs, while its complicated network structure incurs significant memory overhead, making it ill-suited for resource-constrained platforms.…”

Section: Related Work (mentioning)

confidence: 99%

“…With the calculated reward $r(s_t, a_t)$ and the observed state $s_{t+1}$ caused by action $a_t$, the actor model aims to generate new action $a_{t+1} = \mu(s_{t+1})$ such that the new expected reward caused by $a_{t+1}$ is maximized, where the new expected reward $Q(s_{t+1}, a_{t+1})$ is calculated by the critical model according to Eq. (7). With the feedback from the reward function $r(s_t, a_t)$, both the actor model and the critical model are trained in a back propagation process from the approach in [26].…”

Section: Introduction (mentioning)

confidence: 99%


EdgeML

Wang, Ling, Xing. 2021. Proceedings of the International Conference on Internet-of-Things Design and Implementation

In recent years, deep learning algorithms are increasingly adopted by a wide range of data-intensive and time-critical Internet of Things (IoT) applications. As a result, several new approaches, including model partition/offloading and progressive neural architecture, have been proposed to address the challenge of deploying the computation-intensive deep neural network (DNN) models on resource-constrained edge devices. However, the performance of existing approaches is highly affected by runtime dynamics. For example, offloading workload from edge to cloud suffers from communication delays and the efficiency of progressive neural architecture supporting early-exit DNN executions relies on input characteristics. In this paper, we introduce EdgeML, an AutoML framework that provides flexible and fine-grained DNN model execution control by combining workload offloading mechanism and dynamic progressive neural architecture. To achieve desirable latency-accuracy-energy system performance on edge platforms, EdgeML adopts reinforcement learning to automatically update model execution policy in response to runtime dynamics in real-time. We implement EdgeML for several widely used DNN models on the latest edge devices. Comparing to existing approaches, our experiments show that EdgeML achieves up to 8× performance improvement under dynamic environments.

CCS Concepts: Computer systems organization → Real-time system architecture; Computing methodologies → Neural networks.


“…Initially, one needs to pick the architecture of the early-exit model. There are largely two avenues followed in the literature: i) hand-tuned end-to-end designed networks for early-exiting, such as MSDNet [30], and ii) vanilla backbone networks, enhanced with early exits along their depth [12,35,41,68]. This design choice is crucial as it later affects the capacity and the learning process of the network, with different architectures offering varying scalability potential and convergence dynamics.…”

Section: Designing the Architecture (mentioning)

confidence: 99%
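The second design avenue in the statement above, a backbone with early exits attached along its depth, can be sketched as follows. This is an illustrative sketch only: it assumes a softmax-confidence threshold as the exit criterion, which is one common choice and not necessarily the policy used by FlexDNN or the surveyed systems. The names `stages`, `exit_heads`, and `threshold` are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(x,
                       stages: Sequence[Callable],
                       exit_heads: Sequence[Callable],
                       threshold: float = 0.9) -> Tuple[int, int, float]:
    """Run backbone stages in order and stop at the first confident exit.

    Returns (predicted label, index of the exit taken, confidence).
    The last exit is always taken if no earlier exit is confident enough.
    """
    if not stages or len(stages) != len(exit_heads):
        raise ValueError("need one exit head per backbone stage")
    features = x
    last = len(stages) - 1
    for depth, (stage, head) in enumerate(zip(stages, exit_heads)):
        features = stage(features)          # backbone computation
        probs = softmax(head(features))     # lightweight exit classifier
        confidence = max(probs)
        if confidence >= threshold or depth == last:
            return probs.index(confidence), depth, confidence
```

Easy inputs leave at a shallow exit and skip the remaining stages, while hard inputs fall through to the final exit. Lowering `threshold` trades accuracy for less computation.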

“…
SPINN [42] | Vision/Classification | Partial inference offloading of EE-networks.
FlexDNN [12] | Vision/Classification | Footprint overhead-aware design of EE-networks.
DDI [76] | Vision/Classification | Combines layer/channel skipping with early exiting.
…”

Section: Early-exiting Network-agnostic Techniques (mentioning)

confidence: 99%

Adaptive Inference through Early-Exit Networks

Laskaridis, Kouris, Lane. 2021. Proceedings of the 5th International Workshop on Embedded and Mobile Deep Learning

DNNs are becoming less and less over-parametrised due to recent advances in efficient model design, through careful hand-crafted or NAS-based methods. Relying on the fact that not all inputs require the same amount of computation to yield a confident prediction, adaptive inference is gaining attention as a prominent approach for pushing the limits of efficient deployment. Particularly, early-exit networks comprise an emerging direction for tailoring the computation depth of each input sample at runtime, offering complementary performance gains to other efficiency optimisations. In this paper, we decompose the design methodology of early-exit networks to its key components and survey the recent advances in each one of them. We also position early-exiting against other efficient inference solutions and provide our insights on the current challenges and most promising future directions for research in the field.

CCS Concepts: Computing methodologies → Neural networks; Human-centered computing → Ubiquitous and mobile computing systems and tools.


Efficient Visual Recognition: A Survey on Recent Advances and Brain-inspired Methodologies

Wu, Wang, Lu et al. 2022. Mach. Intell. Res.

Visual recognition is currently one of the most important and active research areas in computer vision, pattern recognition, and even the general field of artificial intelligence. It has great fundamental importance and strong industrial needs, particularly the modern deep neural networks (DNNs) and some brain-inspired methodologies, have largely boosted the recognition performance on many concrete tasks, with the help of large amounts of training data and new powerful computation resources. Although recognition accuracy is usually the first concern for new progresses, efficiency is actually rather important and sometimes critical for both academic research and industrial applications. Moreover, insightful views on the opportunities and challenges of efficiency are also highly required for the entire community. While general surveys on the efficiency issue have been done from various perspectives, as far as we are aware, scarcely any of them focused on visual recognition systematically, and thus it is unclear which progresses are applicable to it and what else should be concerned. In this survey, we present the review of recent advances with our suggestions on the new possible directions towards improving the efficiency of DNN-related and brain-inspired visual recognition approaches, including efficient network compression and dynamic brain-inspired networks. We investigate not only from the model but also from the data point of view (which is not the case in existing surveys) and focus on four typical data types (images, video, points, and events). This survey attempts to provide a systematic summary via a comprehensive survey that can serve as a valuable reference and inspire both researchers and practitioners working on visual recognition problems.
