2018
DOI: 10.1145/3299710.3211336

Adaptive deep learning model selection on embedded systems

Abstract: The recent ground-breaking advances in deep learning networks (DNNs) make them attractive for embedded systems. However, it can take a long time for DNNs to make an inference on resource-limited embedded devices. Offloading the computation into the cloud is often infeasible due to privacy concerns, high latency, or the lack of connectivity. As such, there is a critical need to find a way to effectively execute the DNN models locally on the devices. This paper presents an adaptive scheme to determine which DNN m…
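The abstract describes choosing among multiple pre-trained DNNs per input instead of always running a single network. Below is a minimal, hypothetical sketch of such a scheme in Python: a cheap "premodel" (here a KNN classifier over simple image statistics) picks which candidate DNN to run for each image. The feature set, candidate list, and training procedure are illustrative assumptions, not the authors' exact implementation.

```python
# Hypothetical sketch of adaptive DNN model selection (illustrative only).
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Candidate DNNs, ordered roughly from cheapest to most expensive.
CANDIDATES = ["mobilenet", "inception_v3", "resnet_v1_152"]

def cheap_features(image: np.ndarray) -> np.ndarray:
    """Inexpensive image statistics used as premodel input (assumed features)."""
    return np.array([
        image.mean(),           # average brightness
        image.std(),            # contrast proxy
        float(image.shape[0]),  # height
        float(image.shape[1]),  # width
    ])

# Premodel: a KNN classifier trained offline to predict, for each image,
# the cheapest candidate DNN that still classifies it acceptably.
premodel = KNeighborsClassifier(n_neighbors=5)

def train_premodel(train_images, best_model_labels):
    """Offline step: labels come from profiling every candidate DNN per image."""
    X = np.stack([cheap_features(img) for img in train_images])
    premodel.fit(X, best_model_labels)

def select_model(image: np.ndarray) -> str:
    """At inference time: pick one DNN for this input, then run only that DNN."""
    return premodel.predict(cheap_features(image).reshape(1, -1))[0]
```

The point of the design is that the feature extraction and premodel prediction cost a few microseconds, so the selection overhead is negligible next to running an unnecessarily large DNN.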

Cited by 80 publications (34 citation statements). References 42 publications (52 reference statements).
“…Energy and power optimization for embedded and mobile systems is an intensely studied field. There is a wide range of activities on exploiting compiler-based code optimization [12], [13], runtime task scheduling [14], [15], or a combination of both [7] to optimize different workloads for energy efficiency. Other relevant work in web browsing optimization exploits application knowledge to batch network communications [16], [17], and parallel downloading [18], which primarily target the initial page loading phase.…”
Section: A. Energy Optimization (mentioning, confidence: 99%)
“…That is, a small and fast model is used to try to classify the input data, and the big model is only used when the confidence of the little model falls below a predefined threshold. Taylor et al. [103] point out that different DNN models (e.g., MobileNet, ResNet, Inception) reach the lowest inference latency or the highest accuracy on different evaluation metrics (top-1 or top-5) for different images. They then propose a framework for selecting the best DNN in terms of latency and accuracy.…”
Section: Results (mentioning, confidence: 99%)
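The big/little cascade described in this citation statement can be summarized with a short sketch. The `predict_proba` interface and the 0.9 threshold are assumptions made here for illustration; the essential part is only the fallback logic, which accepts the cheap model's answer when it is confident and pays for the large model otherwise.

```python
import numpy as np

CONFIDENCE_THRESHOLD = 0.9  # assumed value; tuned per deployment

def cascade_predict(x, little_model, big_model):
    """Run the small model first; call the big model only when the small
    model's top-class probability falls below the threshold."""
    probs = little_model.predict_proba(x)   # assumed API returning class probabilities
    if float(np.max(probs)) >= CONFIDENCE_THRESHOLD:
        return int(np.argmax(probs))        # cheap path: confident enough
    probs = big_model.predict_proba(x)      # expensive fallback path
    return int(np.argmax(probs))
```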
“…Predictive Modeling. Recent studies have shown that machine-learning-based predictive modeling is effective in code optimization [43], [44], performance prediction [45], [46], parallelism mapping [20], [47], [48], [49], [50], and task scheduling [51], [52], [53], [54], [55], [56]. Its great advantage is its ability to adapt to ever-changing platforms, as it makes no prior assumptions about their behavior.…”
Section: Domain-specific Optimizations (mentioning, confidence: 99%)
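As a deliberately simplified illustration of the predictive-modeling idea in this statement, the sketch below trains a decision tree on made-up program features to predict a good thread count for parallelism mapping. The feature names, training data, and labels are hypothetical; real systems use much richer static and dynamic program features gathered by offline profiling.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy feature vectors per program: [instruction count, branch ratio, memory-op ratio]
# (hypothetical features chosen only for illustration).
X_train = np.array([
    [1.2e6, 0.10, 0.30],
    [4.5e6, 0.25, 0.10],
    [9.0e5, 0.05, 0.55],
])
# Labels: best-performing thread count found by offline exhaustive search.
y_train = np.array([4, 8, 2])

model = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)

def predict_thread_count(features):
    """Predict a good parallelism mapping for an unseen program."""
    return int(model.predict(np.asarray(features).reshape(1, -1))[0])

print(predict_thread_count([2.0e6, 0.15, 0.20]))
```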