2018 International Conference on Advanced Technologies for Communications (ATC)
DOI: 10.1109/atc.2018.8587580

Fixed-Point Implementation of Convolutional Neural Networks for Image Classification

Cited by 26 publications (11 citation statements)
References 1 publication
“…The extent of these complications has introduced shallow networks [525], approaches to quantizing model parameters [526], and ternary [527] and binary [528] models that focus on reducing the memory overhead for efficient resource-constrained hardware accelerator implementation. The authors in [529] provide an example of a fixed-point CNN classifier using 4-bit fixed-point arithmetic that shows negligible accuracy degradation, and the authors in [530] present fast BNN inference accelerators that meet FPGA on-chip memory requirements. Reducing memory footprints in hardware accelerators is also tied to the cost-effective design of memory units.…”
Section: Current and Future Challenges
Classification: mentioning (confidence: 99%)
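For illustration only, the sketch below shows what a signed 4-bit fixed-point weight quantizer of the kind mentioned in this statement could look like. The Q-format split (`total_bits`, `frac_bits`) and the helper names `to_fixed_point` / `from_fixed_point` are hypothetical choices for this sketch, not the scheme used in [529].

```python
import numpy as np

def to_fixed_point(weights, total_bits=4, frac_bits=2):
    # Signed fixed-point: total_bits including the sign bit,
    # frac_bits fractional bits, i.e. a step size of 1 / 2**frac_bits.
    scale = 1 << frac_bits
    qmin = -(1 << (total_bits - 1))          # -8 for 4 bits
    qmax = (1 << (total_bits - 1)) - 1       # +7 for 4 bits
    codes = np.clip(np.round(weights * scale), qmin, qmax).astype(np.int8)
    return codes, scale

def from_fixed_point(codes, scale):
    # Map the integer codes back to approximate floating-point weights.
    return codes.astype(np.float32) / scale

# Quantize a small random "weight" tensor and check the rounding error.
w = 0.5 * np.random.randn(3, 3).astype(np.float32)
codes, scale = to_fixed_point(w)
print("max abs error:", np.max(np.abs(w - from_fixed_point(codes, scale))))
```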
“…Deep learning algorithms are usually implemented in software using 32-bit floating-point values (FP32). Migrating a deep learning algorithm to an ASIC or an FPGA requires a bit-width reduction, which is possible using quantization techniques [24, 25, 26]. A quantized model and a non-quantized model execute the same operations; however, a quantized model with reduced bit width lowers the memory footprint and allows more operations to be executed per cycle.…”
Section: State of the Art
Classification: mentioning (confidence: 99%)
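As a rough sketch of the bit-width reduction described in the statement above, the snippet below maps an FP32 tensor to 8-bit codes with an affine scale and zero-point and back again. The function names and the uniform-affine scheme are assumptions for illustration and do not reproduce any specific technique from [24, 25, 26].

```python
import numpy as np

def quantize_tensor(x, num_bits=8):
    # Uniform affine quantization: map [x_min, x_max] onto [0, 2**num_bits - 1].
    qmin, qmax = 0, (1 << num_bits) - 1
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / (qmax - qmin) if x_max > x_min else 1.0
    zero_point = int(round(qmin - x_min / scale))
    codes = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return codes, scale, zero_point

def dequantize_tensor(codes, scale, zero_point):
    # Recover approximate FP32 values from the 8-bit codes.
    return (codes.astype(np.float32) - zero_point) * scale

x = np.random.randn(1024).astype(np.float32)         # stand-in for FP32 weights
codes, scale, zp = quantize_tensor(x)
print("FP32 bytes:", x.nbytes, "-> 8-bit bytes:", codes.nbytes)   # 4x smaller
print("max abs error:", np.max(np.abs(x - dequantize_tensor(codes, scale, zp))))
```

The integer codes are what allow narrower multiply-accumulate units on an ASIC or FPGA, which is where the memory reduction and the extra operations per cycle come from.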
“…For instance, an 8-bit integer multiplier is much faster than its 32-bit floating-point counterpart. Some existing quantization techniques require retraining of the network (LQ-Nets [7], [15]), which is not flexible in many situations. Hence, this work focuses only on post-training static quantization, which does not require retraining.…”
Section: Background and Motivation of DoubleQ, a Quantization of Convolutional Neural Networks (CNN)
Classification: mentioning (confidence: 99%)
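To make the post-training static quantization idea concrete, here is a minimal sketch: activation ranges are observed on a small calibration set, and the scale/zero-point are then frozen for inference, so no retraining is needed. The helper names (`calibrate_range`, `make_static_quantizer`) and the toy ReLU layer are hypothetical and are not the DoubleQ method itself.

```python
import numpy as np

def calibrate_range(layer_fn, calibration_batches):
    # Run FP32 inference on a small calibration set and record the
    # observed activation range; the network itself is never retrained.
    lo, hi = np.inf, -np.inf
    for batch in calibration_batches:
        act = layer_fn(batch)
        lo, hi = min(lo, float(act.min())), max(hi, float(act.max()))
    return lo, hi

def make_static_quantizer(lo, hi, num_bits=8):
    # Freeze scale / zero-point from the calibrated range; the same
    # parameters are reused for every input at inference time ("static").
    qmin, qmax = 0, (1 << num_bits) - 1
    scale = (hi - lo) / (qmax - qmin) if hi > lo else 1.0
    zero_point = int(round(qmin - lo / scale))
    def quantize(x):
        return np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return quantize, scale, zero_point

# Toy usage: calibrate one ReLU layer with fixed weights on random batches.
W = np.random.randn(16, 16).astype(np.float32)
layer = lambda x: np.maximum(x @ W, 0.0)
batches = [np.random.randn(8, 16).astype(np.float32) for _ in range(4)]
quantize, scale, zp = make_static_quantizer(*calibrate_range(layer, batches))
quantized_activations = quantize(layer(batches[0]))
```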