“…With ever-more devices available on the edge, this network of devices must be exploited to improve model accuracy without increasing the communication latency. Towards this, we describe our recent work on Network-of-Neural Networks (NoNN) [1] for memory- and communication-aware model compression.…”
Section: Communication-aware Model Compression
“…These patterns of activations reveal how knowledge learned by the teacher network is distributed at the final convolution layer. Therefore, we first use these patterns to create a filter activation network [1] which represents how knowledge is distributed across multiple filters (see Fig. 3(b)).…”
Section: Network-of-Neural Networks
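The filter activation network described above can be made concrete with a toy sketch: treat each final-layer filter as a graph node, weight edges by how strongly pairs of filters co-activate across samples, and split the graph into communities that can seed separate student networks. The code below is a minimal illustration with made-up data; the similarity measure and the spectral bisection standing in for community detection are our assumptions, not necessarily the exact algorithm of [1].

```python
import numpy as np

def filter_activation_graph(acts):
    """Build a co-activation graph over filters.

    acts: (num_samples, num_filters) average activations at the
    teacher's final convolution layer (toy data here).
    Returns a symmetric edge-weight matrix with zero diagonal.
    """
    a = np.abs(acts)
    A = a.T @ a                 # filters that fire together -> strong edge
    np.fill_diagonal(A, 0.0)
    return A

def spectral_bisect(A):
    """Split filters into two communities using the Fiedler vector
    (eigenvector of the second-smallest Laplacian eigenvalue)."""
    L = np.diag(A.sum(axis=1)) - A
    _, vecs = np.linalg.eigh(L)  # eigh returns ascending eigenvalues
    return vecs[:, 1] >= 0       # boolean community mask

# Toy teacher: filters 0-2 respond to one class group, filters 3-5 to
# another, plus faint background noise so the graph stays connected.
rng = np.random.default_rng(0)
acts = 0.01 * rng.random((100, 6))
acts[:50, :3] += rng.random((50, 3))
acts[50:, 3:] += rng.random((50, 3))

mask = spectral_bisect(filter_activation_graph(acts))
```

Each community's filters would then be distilled into one small student, so no single edge device has to hold the full teacher in memory.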
“…However, due to their enormous computational complexity, deploying such models on constrained devices has emerged as a critical bottleneck for large-scale adoption of intelligence at the IoT edge. It has been estimated that the number of connected IoT-devices will reach one trillion across various market segments by 2035; this provides a unique opportunity for integrating widespread intelligence into edge devices. Such exponential growth in IoT-devices necessitates new breakthroughs in Artificial Intelligence research that can more effectively deploy learning at the edge and, therefore, truly exploit a setup of trillions of IoT-devices.…”
Section: Introductionmentioning
“…Finally, since IoT naturally implies a network of connected devices, it automatically opens the door to a new class of problems: communication-aware model compression [1]. For instance, many smart home/city applications can have numerous connected IoT-sensors with, say, 500KB total memory per node.…”
The significant computational requirements of deep learning present a major bottleneck for its large-scale adoption on hardware-constrained IoT-devices. Here, we envision a new paradigm called EdgeAI to address major impediments associated with deploying deep networks at the edge. Specifically, we discuss the existing directions in computation-aware deep learning and describe two new challenges in the IoT era: (1) data-independent deployment of learning, and (2) communication-aware distributed inference. We further present new directions from our recent research to alleviate these two challenges. Overcoming them is crucial for rapid adoption of learning on IoT-devices in order to truly enable EdgeAI.
“…Different methods have been developed [207], [208] to partition a pre-trained DNN over several mobile devices in order to accelerate DNN inference on-device. Bhardwaj et al. [209] further considered memory and communication costs in this distributed inference architecture, proposing model compression and a network-science-based knowledge-partitioning algorithm to address these issues. For robotic systems where the model is partitioned between the edge server and the robot, the robot should take both local computation accuracy and offloading latency into account; this offloading problem was formulated in [210] as a sequential decision-making problem solved by a deep reinforcement learning algorithm.…”
Section: Computation Offloading Based Edge Inference Systems
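The device/server partitioning these works study can be illustrated with a toy latency model: choose the layer after which to ship the intermediate activation to a server, trading device compute against transfer time. This is only a sketch of the general split-computing idea with entirely hypothetical numbers and names, not the formulation used in [207]–[210].

```python
def best_split(layer_flops, out_bytes, input_bytes,
               dev_flops, srv_flops, bw):
    """Brute-force the split point k: the device runs layers [0, k),
    transmits that point's activation, and the server runs layers
    [k, n). k = 0 offloads everything; k = n runs fully on-device.
    Returns (k, latency_in_seconds)."""
    n = len(layer_flops)
    best = (0, float("inf"))
    for k in range(n + 1):
        t_dev = sum(layer_flops[:k]) / dev_flops
        if k == n:
            t_tx = 0.0  # fully on-device, nothing to send
        else:
            sent = out_bytes[k - 1] if k > 0 else input_bytes
            t_tx = sent / bw
        t_srv = sum(layer_flops[k:]) / srv_flops
        total = t_dev + t_tx + t_srv
        if total < best[1]:
            best = (k, total)
    return best

# Hypothetical 3-layer network: early activations are large, later tiny.
flops = [1e9, 1e9, 1e9]   # per-layer work (FLOPs)
outs  = [4e6, 1e5, 4e3]   # activation size after each layer (bytes)
k, t = best_split(flops, outs, input_bytes=1e7,
                  dev_flops=1e9, srv_flops=1e10, bw=1e6)
# best k is 2: run two layers on the device, then offload
```

Here the optimum is to split after layer 2, because that layer's output is small enough that shipping it beats either sending the raw input or finishing inference on the slow device.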
Artificial intelligence (AI) has achieved remarkable breakthroughs in a wide range of fields, from speech processing and image classification to drug discovery. This is driven by the explosive growth of data, advances in machine learning (especially deep learning), and easy access to vastly powerful computing resources. In particular, the wide-scale deployment of edge devices (e.g., IoT devices) generates data at an unprecedented scale, which provides the opportunity to derive accurate models and develop various intelligent applications at the network edge. However, such enormous data cannot all be sent from end devices to the cloud for processing, due to varying channel quality, traffic congestion, and/or privacy concerns. By pushing the inference and training of AI models to edge nodes, edge AI has emerged as a promising alternative. AI at the edge requires close cooperation among edge devices, such as smartphones and smart vehicles, and edge servers at wireless access points and base stations, which, however, results in heavy communication overhead. In this paper, we present a comprehensive survey of recent developments in techniques for overcoming these communication challenges. Specifically, we first identify key communication challenges in edge AI systems. We then introduce communication-efficient techniques, from both algorithmic and system perspectives, for training and inference tasks at the network edge. Potential future research directions are also highlighted.