2019
DOI: 10.48550/arxiv.1907.11804
Preprint

Memory- and Communication-Aware Model Compression for Distributed Deep Learning Inference on IoT

Cited by 2 publications (10 citation statements). References 0 publications.
“…With ever-more devices available on the edge, this network of devices must be exploited to improve model accuracy without increasing the communication latency. Towards this, we describe our recent work on Network-of-Neural Networks (NoNN) [1] for memory- and communication-aware model compression.…”
Section: Communication-aware Model Compression (mentioning, confidence: 99%)
“…These patterns of activations reveal how knowledge learned by the teacher network is distributed at the final convolution layer. Therefore, we first use these patterns to create a filter activation network [1] which represents how knowledge is distributed across multiple filters (see Fig. 3(b)).…”
Section: Network-of-Neural Networks (mentioning, confidence: 99%)
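The filter activation network described in this citation lends itself to a brief illustration. The following is a minimal sketch, assuming the teacher's final-convolution activations are available as a (samples × filters) array; it builds a filter co-activation graph and splits it with off-the-shelf greedy modularity communities as a stand-in for the network-science partitioner of NoNN [1]. All names, thresholds, and the choice of community method are illustrative assumptions, not the paper's exact construction.

```python
# Hypothetical sketch of a filter "activation network": edge weights count how
# often two final-conv filters are simultaneously active, and communities give
# one filter group per student/edge device. Not the exact NoNN [1] procedure.
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def build_activation_network(activations):
    """activations: (num_samples, num_filters) pooled final-conv outputs."""
    # A filter counts as "active" on a sample if it exceeds its own mean response.
    active = (activations > activations.mean(axis=0, keepdims=True)).astype(np.int32)
    co_activation = active.T @ active          # (num_filters, num_filters) counts
    g = nx.Graph()
    n = co_activation.shape[0]
    for i in range(n):
        for j in range(i + 1, n):
            if co_activation[i, j] > 0:
                g.add_edge(i, j, weight=int(co_activation[i, j]))
    return g

def partition_filters(g, n_parts):
    """One community per student/device; best_n requires networkx >= 2.8."""
    communities = greedy_modularity_communities(
        g, weight="weight", cutoff=n_parts, best_n=n_parts)
    return [set(c) for c in communities]

# Example: 1000 samples, 128 final-conv filters, split across 4 edge devices.
acts = np.random.rand(1000, 128)
parts = partition_filters(build_activation_network(acts), n_parts=4)
print([len(p) for p in parts])
```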
“…Different methods have been developed [207], [208] to partition a pre-trained DNN over several mobile devices in order to accelerate DNN inference on devices. Bhardwaj et al. [209] further considered memory and communication costs in this distributed inference architecture, for which model compression and a network-science-based knowledge partitioning algorithm are proposed to address these issues. For a robotics system where the model is partitioned between the edge server and the robot, the robot should take both local computation accuracy and offloading latency into account; this offloading problem was formulated in [210] as a sequential decision-making problem that is solved by a deep reinforcement learning algorithm.…”
Section: Computation Offloading Based Edge Inference Systems (mentioning, confidence: 99%)
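The device/server partitioning mentioned in this citation can be illustrated with a short sketch, assuming PyTorch, a torchvision ResNet-18, and an arbitrarily chosen split point (none of these come from [207]-[210]): the device runs the early layers and transmits a compact intermediate tensor, and the edge server completes the forward pass. Moving the split point earlier or later trades device compute against the size of the transmitted activation.

```python
# Minimal device/server split of a pre-trained-style DNN. The split index and
# the use of ResNet-18 are illustrative assumptions, not the cited methods.
import torch
import torch.nn as nn
from torchvision.models import resnet18

full = resnet18(weights=None).eval()     # weights=None: no download (torchvision >= 0.13)
layers = list(full.children())           # conv1, bn1, relu, maxpool, layer1..4, avgpool, fc
split = 6                                # hypothetical split point, after layer2

device_head = nn.Sequential(*layers[:split])            # runs on the robot / IoT device
server_tail_conv = nn.Sequential(*layers[split:-1])     # runs on the edge server
server_fc = layers[-1]                                   # final classifier on the server

with torch.no_grad():
    x = torch.randn(1, 3, 224, 224)      # input captured on the device
    feat = device_head(x)                # intermediate activation sent over the network
    out = server_fc(torch.flatten(server_tail_conv(feat), 1))

# Communication cost per inference is roughly feat.numel() * bytes per element.
print(feat.shape, out.shape)
```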