TOD: Transprecise Object Detection to Maximise Real-Time Accuracy on the Edge

Lee, Junkyu; Varghese, Blesson; Woods, Roger; Vandierendonck, Hans

doi:10.1109/icfec51620.2021.00015

Cited by 11 publications

(21 citation statements)

References 24 publications

“…There has been a considerable body of work on exploring runtime adaptation of NNs according to given compute resource constraints [1], [25]-[29]. The single NestDNN NN model [1] switches between multiple capacities of the NN during runtime according to accuracy and inference latency requirements.…”

Section: Runtime Adaptation of NNs (mentioning)

confidence: 99%

“…Similarly, Yu et al [29] proposes the Slimmable Neural Network, in which the filter parameters are shared from a smaller capacity model to increase the capacity of the NN. Another study [25] proposes to use a runtime decision mechanism to switch between multiple NNs dynamically, according to video content and computational latency, in order to improve the real-time object detection accuracy.…”

Section: Runtime Adaptation of NNs (mentioning)

confidence: 99%

Increased Leverage of Transprecision Computing for Machine Vision Applications at the Edge

Minhas, Lee, Mukhanov, et al. 2022. J Sign Process Syst (self-citation)

The practical deployment of machine vision presents particular challenges for resource-constrained edge devices. With a clear need to execute multiple tasks with variable workloads, there is a need for a robust approach that can dynamically adapt at runtime and which can maintain the maximum quality of service (QoS) within the available resource constraints. A lightweight approach that monitors the runtime workload constraints and leverages accuracy-throughput trade-offs on a graphics processing unit (GPU) is presented. It includes optimisation techniques that identify the configurations for each task in terms of optimal accuracy, energy and memory, and management of the transparent switching between configurations. Using a neural network architecture search that statically generates a range of implementations that target a resource-precision trade-off, we explore the detection of the optimal parameters for the required QoS under specific memory and energy constraints. For an accuracy loss of 1%, we demonstrate that a $1.6\times$ higher frame processing rate can be achieved on GPU, with further improvements possible at further relaxed accuracy. In order to further improve the switching between configurations, we enhance the proposed mechanism by employing central processing units (CPUs) for offloading some of the executed frames, which helps to improve the frame rate by a further 0.9%.
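
The selection step described in this abstract (choosing, from statically generated implementations, the one that best meets the required QoS under memory and energy constraints) can be illustrated with a minimal sketch. It is not the authors' implementation: the `Config` fields, the `pick_config` helper and every number in `CONFIGS` are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Config:
    """One statically generated implementation (all fields are assumptions)."""
    name: str
    accuracy: float    # detection accuracy, 0..1
    fps: float         # frames processed per second
    memory_mb: float   # memory footprint
    energy_j: float    # energy per frame


# Made-up candidates; these are not measurements from the paper.
CONFIGS = (
    Config("full", 0.80, 15.0, 900.0, 1.2),
    Config("reduced", 0.78, 25.0, 600.0, 0.8),
    Config("small", 0.70, 50.0, 300.0, 0.4),
)


def pick_config(configs: Sequence[Config], min_fps: float,
                max_memory_mb: float, max_energy_j: float) -> Optional[Config]:
    """Return the most accurate config that satisfies every constraint."""
    feasible = [
        c for c in configs
        if c.fps >= min_fps
        and c.memory_mb <= max_memory_mb
        and c.energy_j <= max_energy_j
    ]
    # None signals that no configuration meets the requested QoS.
    return max(feasible, key=lambda c: c.accuracy, default=None)
```

A runtime monitor could call `pick_config` again whenever the workload or the resource budget changes and switch to the returned configuration.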

“…As shown in [8], on average about 83% of runtime computations for many machine learning and deep learning applications can be approximated. This can lead to substantial savings in power consumption, for instance in [9], the authors saved 63% of the power consumption by using approximations. This adaptive approximation can be performed on the hardware, e.g., using Dynamic Partial Reconfiguration (DPR) and by instantiating of different arithmetic hardware units [10], or using dynamic change of the frequency of operations [11].…”

Section: Energy-aware Approximate Deep Learning (mentioning)

confidence: 99%

“…This adaptive approximation can be performed on the hardware, e.g., using Dynamic Partial Reconfiguration (DPR) and by instantiating of different arithmetic hardware units [10], or using dynamic change of the frequency of operations [11]. It could also happen at the software level, e.g., by changing the utilized machine learning algorithm based on input data, such as [9] and [12], or by changing the memory access policy used in [13]. The benefits of these approaches can be up to 2.5× savings in the energy consumption of the system.…”

Section: Energy-aware Approximate Deep Learning (mentioning)

confidence: 99%

Energy-aware Adaptive Approximate Computing for Deep Learning Applications

Nima, Salar. 2022. 2022 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)

Applications that use deep learning incur a substantial amount of energy consumption. Reducing this energy footprint is important, especially for applications such as Internet of Things (IoT) Embedded Systems (ESs), where resources are scarce. Here, we present computational self-awareness as a promising solution for intelligently adapting machine learning algorithms at runtime to reduce their energy consumption. In particular, we focus on approximation as a key enabling knob for such adaptivity. We show that the benefits of such an approach can be up to 2.5× energy savings.

“…Consider the example of a real-time video analytics application, such as identifying objects on different frames of a video stream. A different DNN model from a portfolio of models can be employed for each frame to maximise the accuracy of detection [LVWV21]. This is achieved by leveraging the meta-characteristics of each video frame, such as the size of the object and the speed of movement of the object.…”

Section: The Effect of Workload Patterns (mentioning)

confidence: 99%
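
The per-frame selection described in this statement can be sketched as follows. This is a hedged illustration, not the TOD authors' algorithm: the `Model` fields, the `PORTFOLIO` entries, the thresholds `large_area_frac` and `tolerance_px`, and the decision rule itself are all assumptions. The rule assumed here is that large objects are easy enough for the lightest model, while for small objects the most accurate model is used as long as its latency does not let the object move further than a tolerance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    """A detector in the portfolio (values below are illustrative only)."""
    name: str
    latency_ms: float  # assumed per-frame inference latency
    accuracy: float    # assumed accuracy on small objects, 0..1


# Hypothetical portfolio, from lightest to heaviest.
PORTFOLIO = (
    Model("tiny", 10.0, 0.55),
    Model("medium", 30.0, 0.70),
    Model("large", 80.0, 0.85),
)


def select_model(object_area_frac: float, speed_px_per_ms: float,
                 tolerance_px: float = 20.0,
                 large_area_frac: float = 0.1) -> Model:
    """Pick a model for the next frame from two frame meta-characteristics."""
    fastest = min(PORTFOLIO, key=lambda m: m.latency_ms)
    # Large objects: assume the lightest model is accurate enough.
    if object_area_frac >= large_area_frac:
        return fastest
    # Static small objects: latency does not matter, take the most accurate.
    if speed_px_per_ms <= 0:
        return max(PORTFOLIO, key=lambda m: m.accuracy)
    # Moving small objects: the result must arrive before the object has
    # moved more than tolerance_px, which caps the usable latency.
    latency_cap_ms = tolerance_px / speed_px_per_ms
    feasible = [m for m in PORTFOLIO if m.latency_ms <= latency_cap_ms]
    if not feasible:
        return fastest
    return max(feasible, key=lambda m: m.accuracy)
```

For example, `select_model(0.01, 0.5)` allows a 40 ms latency cap and so returns the hypothetical "medium" model.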

Toward Sustainable Serverless Computing

Patros, Spillner, Papadopoulos, et al. 2021. IEEE Internet Comput. (self-citation)

Although serverless computing generally involves executing short-lived "functions", the increasing migration to this computing paradigm requires careful consideration of energy and power requirements. Serverless computing is also viewed as an economically-driven computational approach, often influenced by the cost of computation, as users are charged for per-sub-second use of computational resources rather than the coarse-grained charging that is common with virtual machines and containers. To ensure that the startup times of serverless functions do not discourage their use, resource providers need to keep these functions hot, often by passing in synthetic data. We describe the real power consumption characteristics of serverless, based on execution traces reported in the literature, and describe potential strategies (some adopted from existing VM and container-based approaches) that can be used to reduce the energy overheads of serverless execution. Our analysis is purposefully biased towards the use of machine learning workloads because: (i) such workloads are increasingly being used across different applications; (ii) functions that implement machine learning algorithms can range in complexity from long-running (deep learning) to short-running (inference only), enabling us to consider serverless across a variety of possible execution behaviours. The general findings are also easily translatable to other domains.
