Feature Map Transform Coding for Energy-Efficient CNN Inference

Chmiel, Brian; Baskin, Chaim; Zheltonozhskii, Evgenii; Banner, Ron; Yermolin, Yevgeny; Karbachevsky, Alex; Bronstein, Alex; Mendelson, Avi

doi:10.1109/ijcnn48605.2020.9206968

Cited by 22 publications

(5 citation statements)

References 25 publications


“…In a CNN, the same set of weights, also called filters or kernels, is used to convolve the input feature map at different locations, resulting in different output feature maps. Each filter generates one output feature map, and the number of filters determines the number of output feature maps produced by the layer [9]. Each output feature map represents a set of learned spatial features that the layer is sensitive to.…”

Section: Software Approaches (mentioning)

confidence: 99%

Advancements in On-Device Deep Neural Networks

Saravanan,

Kouzani

2023

Information


In recent years, rapid advancements in both hardware and software technologies have resulted in the ability to execute artificial intelligence (AI) algorithms on low-resource devices. The combination of high-speed, low-power electronic hardware and efficient AI algorithms is driving the emergence of on-device AI. Deep neural networks (DNNs) are highly effective AI algorithms used for identifying patterns in complex data. DNNs, however, contain many parameters and operations that make them computationally intensive to execute. Accordingly, DNNs are usually executed on high-resource backend processors. This causes an increase in data processing latency and energy expenditure. Therefore, modern strategies are being developed to facilitate the implementation of DNNs on devices with limited resources. This paper presents a detailed review of the current methods and structures that have been developed to deploy DNNs on devices with limited resources. Firstly, an overview of DNNs is presented. Next, the methods used to implement DNNs on resource-constrained devices are explained. Following this, the existing works reported in the literature on the execution of DNNs on low-resource devices are reviewed. The reviewed works are classified into three categories: software, hardware, and hardware/software co-design. Then, a discussion on the reviewed approaches is given, followed by a list of challenges and future prospects of on-device AI, together with its emerging applications.
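The citation statement quoted for this entry describes how a convolutional layer reuses each filter across all input locations and produces one output feature map per filter. A minimal sketch of that idea follows, in plain Python. It makes several assumptions that are not in the source: a single-channel input, stride 1, no padding, and hypothetical names (`conv2d_valid`, `conv_layer`). As in most deep learning frameworks, the "convolution" here is computed as a cross-correlation (the kernel is not flipped).

```python
def conv2d_valid(image, kernel):
    """Slide one kernel over a 2D input (stride 1, no padding).

    The same kernel weights are reused at every location, which is the
    weight sharing described in the quoted statement.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def conv_layer(image, filters):
    """Apply every filter to the same input: one output feature map per filter."""
    return [conv2d_valid(image, kernel) for kernel in filters]


# Illustrative 4x4 single-channel input (values chosen arbitrarily).
image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [0, 1, 2, 3],
]

# Two illustrative 2x2 filters, so the layer yields two feature maps.
filters = [
    [[1, 0], [0, -1]],  # difference along the main diagonal
    [[1, 1], [1, 1]],   # 2x2 box sum
]

feature_maps = conv_layer(image, filters)
print(len(feature_maps))                              # 2
print(len(feature_maps[0]), len(feature_maps[0][0]))  # 3 3
```

The number of output maps equals the number of filters, and each map is smaller than the input by the kernel size minus one in each dimension because no padding is applied.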


“…Lossy compression is those in which there is a loss of fidelity for natural images like photographs [17]. There are various lossy compression methods like transform coding [18], discrete cosine transform (DCT) [19], discrete wavelet transforms (DWT) [20], chroma subsampling [21], fractals lossless compression is generally used for medical imaging, drawings, comics. There are various methods for lossless compression like run-length coding, predictive coding, entropy coding, Huffman coding, Lempel Ziv Welch (LZW) [16], [22], [23].…”

Section: Introduction (mentioning)

confidence: 99%

Hybrid information security system via combination of compression, cryptography, and image steganography

Awadh

Alasady

Hamoud

2022

IJECE


Today, the world is experiencing a new paradigm characterized by dynamism and rapid change due to revolutions that have gone through information and digital communication technologies, this raised many security and capacity concerns about information security transmitted via the Internet network. Cryptography and steganography are two of the most extensively that are used to ensure information security. Those techniques alone are not suitable for high security of information, so in this paper, we proposed a new system was proposed of hiding information within the image to optimize security and capacity. This system provides a sequence of steps by compressing the secret image using discrete wavelet transform (DWT) algorithm, then using the advanced encryption standard (AES) algorithm for encryption compressed data. The least significant bit (LSB) technique has been applied to hide the encrypted data. The results show that the proposed system is able to optimize the stego-image quality (PSNR value of 47.8 dB) and structural similarity index (SSIM value of 0.92). In addition, the results of the experiment proved that the combination of techniques maintains stego-image quality by 68%, improves system performance by 44%, and increases the size of secret data compared to using each technique alone. This study may contribute to solving the problem of the security and capacity of information when sent over the internet.


“…To meet this rapidly increasing demand for AI capabilities on embedded systems, such as autonomous vehicles, drones, and medical devices, prior research focused on various techniques for reducing the power and energy consumption of NNs deployed on hardware accelerators. These techniques include network compression [Lebedev et al, 2015, Ullrich et al, 2017, Chmiel et al, 2020, Baskin et al, 2021a], pruning [Han et al, 2015], neural architecture search [Liu et al, 2019, Wu et al, 2019, Cai et al, 2019], and quantization [Zhou et al, 2016, Hubara et al, 2018].…”

Section: Introduction (mentioning)

confidence: 99%

FBM: Fast-Bit Allocation for Mixed-Precision Quantization

Kimhi,

Rozen,

Kopetz

et al. 2022

Preprint

Self Cite


Quantized neural networks are well known for reducing latency, power consumption, and model size without significant degradation in accuracy, making them highly applicable for systems with limited resources and low power requirements. Mixed precision quantization offers better utilization of customized hardware that supports arithmetic operations at different bitwidths. Existing mixed-precision schemes rely on having a high exploration space, resulting in a large carbon footprint. In addition, these bit allocation strategies mostly induce constraints on the model size rather than utilizing the performance of neural network deployment on specific hardware. Our work proposes Fast-Bit Allocation for Mixed-Precision Quantization (FBM), which finds an optimal bitwidth allocation by measuring desired behaviors through a simulation of a specific device, or even on a physical one. While dynamic transitions of bit allocation in mixed precision quantization with ultra-low bitwidth are known to suffer from performance degradation, we present a fast recovery solution from such transitions. A comprehensive evaluation of the proposed method on CIFAR-10 and ImageNet demonstrates our method's superiority over current state-of-the-art schemes in terms of the trade-off between neural network accuracy and hardware efficiency. Our source code, experimental settings and quantized models are available at https://github.com/RamorayDrake/FBM/
