Accelerating the CNN Inference on FPGAs

Abdelouahab, Kamel; Pelcat, Maxime; Berry, François

doi:10.1201/9781351003827-1

Cited by 49 publications

(55 citation statements)

References 57 publications

(124 reference statements)

Supporting

Mentioning

Contrasting

Order By: Relevance

“…GPUs are the most widely used platforms to implement CNNs due to their processing power (up to 11 TFLOP/s), FPGA is a real alternative for real-time radar data analysis in terms of power consumption (vs GPUs), rapid prototyping, and massively parallel computing capabilities at different data rates [28]. As a result, numerous FPGA-based CNN accelerators have been proposed, targeting both High Performance Computing data-centers and embedded applications [29,30].…”

Section: IVmentioning

confidence: 99%

See 1 more Smart Citation

Radar Signal Processing for Sensing in Assisted Living: The Challenges Associated With Real-Time Implementation of Emerging Algorithms

Kernec

Fioranelli

Ding

et al. 2019

IEEE Signal Process. Mag.

126

View full text Add to dashboard Cite

This article presents radar signal processing for sensing in the context of assisted living. This is covered through 3 example applications: human activity recognition for activities of daily living, respiratory disorder and Sleep Stages classification. The common challenge of classification is discussed within a framework of measurements/pre-processing, feature extraction, and classification algorithms for supervised learning. Then, the specific challenges of the 3 applications from a signal processing standpoint are detailed in their specific data processing and ad-hoc classification strategies, focusing on recent trends in the field of activity recognition (multidomain, multi-modal and fusion) and healthcare applications based on vital signs (super-resolution techniques) and commenting on outstanding challenges. To conclude, this paper explores the challenge of the real-time implementation of signal processing/classification algorithms.

show abstract

Section: IVmentioning

confidence: 99%

“…Regarding real-time implementation, fixed-point arithmetic is privileged, limiting the performances and significantly decreasing the accuracy [39]. However, half-precision floating-point format seems to be interesting to address future implementations on FPGA as well as approximated computing to maintain good energy-performance trade-offs [30].…”

Section: IVmentioning

confidence: 99%

Radar Signal Processing for Sensing in Assisted Living: The Challenges Associated With Real-Time Implementation of Emerging Algorithms

Kernec

Fioranelli

Ding

et al. 2019

IEEE Signal Process. Mag.

126

View full text Add to dashboard Cite

show abstract

“…Neural network pruning, on the other hand, is the process of removing some weights [18] or entire convolution filters [19,27] and their associated feature maps in a neural network, in order to extract a functional "sub-network" that has a lower computational complexity and similar accuracy. A lot of work has been done on enabling and accelerating NN inference on FPGA [1], with tools being released to automatically generate HDL code for any neural network architecture [10], with automated use of quantization and other simplification techniques. These approaches are orthogonal to our work as they could be applied to any neural network.…”

Section: Neural Network On Edge Devicesmentioning

confidence: 99%

Low-Power Neural Networks for Semantic Segmentation of Satellite Images

Bahl

Daniel

Moretti

et al. 2019

2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)

View full text Add to dashboard Cite

Semantic segmentation methods have made impressive progress with deep learning. However, while achieving higher and higher accuracy, state-of-the-art neural networks overlook the complexity of architectures, which typically feature dozens of millions of trainable parameters. Consequently, these networks requires high computational ressources and are mostly not suited to perform on edge devices with tight resource constraints, such as phones, drones, or satellites. In this work, we propose two highlycompact neural network architectures for semantic segmentation of images, which are up to 100 000 times less complex than state-of-the-art architectures while approaching their accuracy. To decrease the complexity of existing networks, our main ideas consist in exploiting lightweight encoders and decoders with depth-wise separable convolutions and decreasing memory usage with the removal of skip connections between encoder and decoder. Our architectures are designed to be implemented on a basic FPGA such as the one featured on the Intel Altera Cyclone V family of SoCs. We demonstrate the potential of our solutions in the case of binary segmentation of remote sensing images, in particular for extracting clouds and trees from RGB satellite images.

show abstract

“…A custom hardware accelerator has challenges of its own, including cost of the hardware, as well as time-to-market for the acceleration solution [1] [2]. Field Programmable Gate Arrays (FPGAs) have proven to be reliable accelerators for rapidly changing industries.…”

Section: Introductionmentioning

confidence: 99%

Hardware Acceleration of Computer Vision and Deep Learning Algorithms on the Edge using OpenCL

Mishra¹,

Chakraborty²,

Makkadayil³

et al. 2019

EAI Endorsed Transactions on Cloud Systems

View full text Add to dashboard Cite

Machine vision using CNN is a key application in Industrial automation environment, enabling real time as well as offline analytics. A lot of processing is required in real time, and in high speed environment variable latency of data transfer makes a cloud solution unreliable. There is a need for application specific hardware acceleration to process CNNs and traditional computer vision algorithms. Cost and time-to-market are critical factors in the fast moving Industrial automation segment which makes RTL based custom hardware accelerators infeasible. This work proposes a low-cost, scalable, compute-at-the-edge solution using FPGA and OpenCL. The paper proposes a methodology that can be used to accelerate traditional as well as machine learning based computer vision algorithms.

show abstract

Accelerating the CNN Inference on FPGAs

Cited by 49 publications

References 57 publications

Radar Signal Processing for Sensing in Assisted Living: The Challenges Associated With Real-Time Implementation of Emerging Algorithms

Radar Signal Processing for Sensing in Assisted Living: The Challenges Associated With Real-Time Implementation of Emerging Algorithms

Low-Power Neural Networks for Semantic Segmentation of Satellite Images

Hardware Acceleration of Computer Vision and Deep Learning Algorithms on the Edge using OpenCL

Contact Info

Product

Resources

About