Proceedings of the 15th ACM Conference on Embedded Network Sensor Systems 2017
DOI: 10.1145/3131672.3131675
DeepIoT

Abstract: Recent advances in deep learning motivate the use of deep neural networks in sensing applications, but their excessive resource needs on constrained embedded devices remain an important impediment. A recently explored solution space lies in compressing (approximating or simplifying) deep neural networks in some manner before use on the device. We propose a new compression solution, called DeepIoT, that makes two key contributions in that space. First, unlike current solutions geared for compressing specific ty…

Cited by 150 publications (17 citation statements)
References 52 publications
“…Some frameworks for executing neural networks distributed into several IoT devices have been proposed [13], [15], [27], [28], [29], [30]. Sze et al [22] reviewed methods for efficiently executing DNNs, focusing on the inference phase, hardware platforms, and architecture for supporting DNNs.…”
Section: A Machine Learning Framework
confidence: 99%
“…DeepIoT [15] compresses CNNs, fully connected neural networks, and Recurrent Neural Networks by extracting redundant neurons. This compression can significantly reduce the DNN size, execution time, and energy consumption without loss of accuracy.…”
Section: A Machine Learning Framework
confidence: 99%
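The statement above summarizes DeepIoT's approach of shrinking networks by removing redundant neurons. As a rough illustration of structured neuron pruning, the sketch below drops the output neurons of a fully connected layer with the smallest weight magnitudes. Note this is a simplified magnitude heuristic, not DeepIoT's actual compressor-critic selection mechanism; the function name and `keep_ratio` parameter are illustrative.

```python
import numpy as np

def prune_fc_layer(W, b, keep_ratio=0.5):
    """Structured pruning of a fully connected layer (illustrative only).

    W : (n_out, n_in) weight matrix, b : (n_out,) bias vector.
    Keeps the `keep_ratio` fraction of output neurons whose weight rows
    have the largest L2 norm; all other neurons are removed entirely,
    shrinking the layer rather than just zeroing weights.
    """
    n_out = W.shape[0]
    n_keep = max(1, int(n_out * keep_ratio))
    norms = np.linalg.norm(W, axis=1)            # importance score per output neuron
    keep = np.sort(np.argsort(norms)[-n_keep:])  # indices of the neurons retained
    return W[keep], b[keep], keep
```

Because pruning shrinks the layer's output dimension, the next layer's weight matrix must also be sliced to the same `keep` indices along its input axis, which is what yields the model-size and execution-time reductions the citing papers describe.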
“…Its downside, common to many DL compression techniques, is a permanent decrease in inference accuracy (of ≈ 5%). On the pruning front, PatDNN enables real-time inference using large-scale DL models (e.g., VGG-16, ResNet-50) on mobile devices by harnessing pattern-based model pruning [33], while DeepIoT [48] uses reinforcement learning to guide the pruning process. Both solutions lead to significant model size reductions (90% to 98.9% in case of DeepIoT) and speedups (up to 44.5× in case of PatDNN) with no inference accuracy degradation in certain settings, demonstrating vast opportunities for mobile DL optimisation.…”
Section: Related Work
confidence: 99%
“…Firstly, it has become common for distributed applications to be configured for use on mobile [26,30,34] or embedded [10] devices with on the order of GBs of memory and storage, but they are both very energyhungry and expensive at the scale of our application. Another approach is to reduce the required memory and computation by simplifying the models used, for instance in deep neural nets [24,42,43]. However, in cases like ours where mobile phones are too expensive and power hungry and the problem can't be adequately simplified, the use of a co-processor comes to mind.…”
Section: Related Work
confidence: 99%