2022
DOI: 10.3390/s22062184

Customizable FPGA-Based Hardware Accelerator for Standard Convolution Processes Empowered with Quantization Applied to LiDAR Data

Abstract: In recent years, there has been an increase in research and development of deep learning solutions for object detection applied to driverless vehicles. This application has benefited from the growing adoption of innovative perception solutions, such as LiDAR sensors, currently the preferred devices for these tasks in autonomous vehicles. There is a broad variety of research on models based on point clouds, which stand out for being efficient and robust in their intended tasks, …

Cited by 7 publications (7 citation statements) | References 37 publications
“…Quantization of the ANN model is a crucial step for successful deployment. It is the process of reducing the bit width of a deep learning model's weights and activation functions by sharing parameters, decreasing hardware resource usage, and consequently optimizing the model for the target FPGA [40]. Apache TVM can convert a high-level ANN model into a deployable quantized module on a range of hardware platforms.…”
Section: AI-Accelerator VTA
confidence: 99%
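To make the bit-width reduction described above concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in NumPy. The function names and the quantization scheme are illustrative assumptions for this example, not APIs from the cited paper or from Apache TVM:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor quantization of float32 weights to int8.

    Returns the int8 tensor and the scale needed to dequantize it.
    """
    scale = np.max(np.abs(w)) / 127.0  # map the largest magnitude to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 tensor from the int8 representation."""
    return q.astype(np.float32) * scale

# Example: a float32 weight tensor shrinks from 32 to 8 bits per value.
w = np.random.randn(64, 3, 3, 3).astype(np.float32)
q, s = quantize_int8(w)
print("max abs error:", np.max(np.abs(w - dequantize(q, s))))
```

Reducing each weight from 32 to 8 bits cuts memory traffic by 4x and lets the FPGA replace floating-point arithmetic with much cheaper integer logic, at the cost of the small rounding error printed above.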
“…In addition, it utilizes VTA to perform further quantization to optimize the ANN models, striking a balance between model accuracy and FPGA resource constraints. The purpose of such a quantization process is to improve time efficiency and resource utilization, since quantization affects the performance of a model as a function of the model depth [40]. Beyond quantization, VTA also performs other operations, including fetch, load, compute, and store, which work together to manage the data flow and optimize the performance of the inference process.…”
Section: AI-Accelerator VTA
confidence: 99%
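As a purely illustrative model of the fetch, load, compute, and store stages mentioned above, the following Python sketch emulates that dataflow in software. VTA's actual modules are hardware units synchronized through dependency queues; the instruction format here is invented for the example:

```python
import numpy as np

def run_inference(instructions, dram):
    """Toy software model of a fetch/load/compute/store dataflow."""
    sram = {}                          # stand-in for on-chip buffers
    for inst in instructions:          # FETCH: read the next instruction
        if inst["op"] == "load":       # LOAD: DRAM -> on-chip buffer
            sram[inst["dst"]] = dram[inst["src"]]
        elif inst["op"] == "compute":  # COMPUTE: e.g. a quantized matmul
            a, b = sram[inst["a"]], sram[inst["b"]]
            sram[inst["dst"]] = a.astype(np.int32) @ b.astype(np.int32)
        elif inst["op"] == "store":    # STORE: on-chip buffer -> DRAM
            dram[inst["dst"]] = sram[inst["src"]]
    return dram

# Hypothetical program: load operands, multiply, write the result back.
dram = {"w": np.ones((2, 2), np.int8), "x": np.ones((2, 2), np.int8)}
prog = [
    {"op": "load", "src": "x", "dst": "x_buf"},
    {"op": "load", "src": "w", "dst": "w_buf"},
    {"op": "compute", "a": "x_buf", "b": "w_buf", "dst": "y_buf"},
    {"op": "store", "src": "y_buf", "dst": "y"},
]
print(run_inference(prog, dram)["y"])
```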
“…Compared to other hardware convolution implementations, the voting convolution has low area and power consumption. The approach followed in [25], an efficient convolution implementation inspired by [26], consumes a total of 10,832 LUTs, and its number of DSPs is proportional to the filter size multiplied by the number of allocated processing elements. Its declared total power consumption is 1.739 W for just one convolution, almost 8.7 times higher than that required by the Voting Block.…”
Section: Functional Validation
confidence: 99%
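To illustrate the stated proportionality, here is a hypothetical worked example; the numbers and the unit proportionality constant are assumptions for illustration, not figures from [25]:

```python
def dsp_count(filter_h: int, filter_w: int, num_pes: int) -> int:
    """DSP usage modeled as filter size times allocated processing
    elements, with the proportionality constant assumed to be 1."""
    return filter_h * filter_w * num_pes

# Hypothetical example: a 3x3 filter with 4 processing elements.
print(dsp_count(3, 3, 4))  # -> 36 DSP slices
```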
“…During the various tests, the dense convolution implemented in [25] was used as a reference. Equation (3) presents an approximation of the processing time of the traditional convolution implemented in [25] according to the size of the input feature map (considering both IFM_channels and OFM_channels equal to one).…”
Section: Sparsity Effect
confidence: 99%
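Equation (3) itself is not reproduced in this excerpt. As a generic stand-in, a dense single-channel convolution with one multiply-accumulate per cycle scales roughly as follows; this is an assumption for illustration, not the paper's actual equation:

```python
def conv_cycles(ifm_h: int, ifm_w: int, k: int,
                pad: int = 0, stride: int = 1) -> int:
    """Illustrative cycle count for a dense k-by-k convolution with one
    MAC per cycle and single input/output channels. NOT Equation (3)."""
    out_h = (ifm_h + 2 * pad - k) // stride + 1
    out_w = (ifm_w + 2 * pad - k) // stride + 1
    return out_h * out_w * k * k  # one multiply-accumulate per cycle

print(conv_cycles(224, 224, 3))  # grows with input feature-map size
```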
“…In most cases, the idea is to implement a processing element that can run convolutions efficiently, onto which the different layers are sequentially mapped, storing the intermediate results in main or local memory. For instance, Silva et al. develop a highly configurable convolution block on an FPGA to accelerate object detection in autonomous driving applications [40]. Yan et al. also develop an accelerator on FPGA, optimizing the design using resource multiplexing and parallel processing, limiting the implementation to kernels to avoid issues with reconfiguration [41].…”
Section: Introduction
confidence: 99%
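A minimal software sketch of this scheme follows, assuming a single shared convolution engine onto which layers are mapped one at a time; `conv_engine` is a host-side stand-in for the hardware processing element, not an API from the cited works:

```python
import numpy as np

def conv_engine(fm: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Stand-in for the hardware PE: a dense 2-D convolution
    (valid padding, stride 1) computed on the host for illustration."""
    k = w.shape[0]
    out_h, out_w = fm.shape[0] - k + 1, fm.shape[1] - k + 1
    out = np.zeros((out_h, out_w), dtype=fm.dtype)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(fm[i:i + k, j:j + k] * w)
    return out

def run_network(layer_weights, fm):
    """Map each layer in turn onto the single engine, keeping the
    intermediate feature map in (host) memory between layers."""
    for w in layer_weights:
        fm = conv_engine(fm, w)  # one layer at a time on the shared PE
    return fm

# Hypothetical example: two 3x3 layers run sequentially on one engine.
x = np.random.randn(8, 8).astype(np.float32)
y = run_network([np.ones((3, 3), np.float32)] * 2, x)
print(y.shape)  # (4, 4)
```

The design choice this models is area efficiency: one well-optimized processing element is time-multiplexed across all layers, trading latency for a much smaller footprint than instantiating per-layer hardware.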