2020
DOI: 10.1109/access.2020.2995330
A Power-Efficient Optimizing Framework FPGA Accelerator Based on Winograd for YOLO

Abstract: Accelerating deep learning networks in edge computing based on power-efficient and highly parallel FPGA platforms is an important goal. Combined with deep learning theory, an accelerator design method based on the Winograd algorithm for the deep learning object detection model YOLO under the PYNQ architecture is proposed. A Zynq FPGA is used to build the hardware acceleration platform of a YOLO network. The Winograd algorithm is used to improve traditional convolution. In the FPGA, the numerous multiplication …
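The abstract's central idea is that Winograd minimal filtering trades multiplications (expensive in FPGA DSP slices) for cheap additions. As a hedged illustration only, here is a minimal NumPy sketch of the 1-D F(2,3) case, using the standard Lavin–Gray transform matrices; the paper itself targets a 2-D FPGA implementation, and the function name here is ours, not from the paper:

```python
import numpy as np

# Winograd F(2,3): computes 2 outputs of a 1-D convolution with a
# 3-tap filter using 4 elementwise multiplications instead of 6.
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)   # input transform
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])                # filter transform
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)    # output transform

def winograd_f23(d, g):
    """d: input tile of 4 samples, g: 3-tap filter -> 2 outputs."""
    U = G @ g      # transformed filter (4 values)
    V = BT @ d     # transformed input  (4 values)
    M = U * V      # the only 4 multiplications
    return AT @ M  # inverse transform -> 2 outputs

d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([1.0, 2.0, 3.0])
print(winograd_f23(d, g))                      # [14. 20.]
print(np.convolve(d, g[::-1], mode="valid"))   # direct result, same values
```

In a 2-D tiled variant (F(2x2, 3x3), as typically used for CNN layers), the same trick cuts the multiplication count per output tile from 36 to 16, which is the source of the DSP savings the abstract refers to.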

Cited by 38 publications (16 citation statements). References 36 publications (37 reference statements).
“…Later, the literature [31] proposed combining the Winograd algorithm with CNN sparsity to improve accelerator performance, but the model used in its evaluation was simple. Bao et al. [32] used a fixed-point quantization approach to reduce FPGA resource consumption and proposed a buffer pipeline approach to further improve accelerator efficiency while reducing resource and power overhead. Wang et al. [33] introduced a new unstructured sparse convolution algorithm using a lower-precision quantization method and an end-to-end design-space search for a dedicated sparse-convolution circuit architecture, which achieved high computational efficiency, but its performance-to-power ratio was relatively low.…”
Section: Background and Related Workmentioning
confidence: 99%
“…An extensive review of hardware acceleration methods from multiple points of view can be found in the survey works [12,13]. Some optimization methods replace the standard convolution algorithm altogether with faster algorithms such as the fast Fourier transform (FFT) [14,15] or Winograd [16,17]. Other methods based on transforming the convolution computation perform convolution as matrix multiplication [18].…”
Section: Related Workmentioning
confidence: 99%
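The "convolution as matrix multiplication" transformation mentioned in the citation above [18] is commonly known as im2col. As a minimal illustrative sketch (the function name `im2col_conv2d` is ours, and real accelerators would avoid materializing the patch matrix):

```python
import numpy as np

def im2col_conv2d(x, w):
    """Valid 2-D cross-correlation via im2col + one matrix multiply.
    x: (H, W) input, w: (kH, kW) kernel."""
    H, W = x.shape
    kH, kW = w.shape
    oH, oW = H - kH + 1, W - kW + 1
    # Unroll every kH x kW patch into one row of a (oH*oW, kH*kW) matrix,
    # so the whole convolution becomes a single matrix-vector product.
    cols = np.array([x[i:i + kH, j:j + kW].ravel()
                     for i in range(oH) for j in range(oW)])
    return (cols @ w.ravel()).reshape(oH, oW)

x = np.arange(16, dtype=float).reshape(4, 4)
w = np.ones((3, 3))
print(im2col_conv2d(x, w))   # 2x2 map of 3x3 patch sums
```

The appeal on hardware is that the reshaped problem maps onto a highly optimized matrix-multiply engine (a GEMM unit or systolic array), at the cost of duplicating overlapping input values in memory.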
“…Miranda et al. [18] achieved 30.8 mAP50 accuracy at 14 FPS with 8-bit precision, and 31.5 mAP50 accuracy at 7 FPS with 16-bit precision, on the COCO dataset. Bao et al. [19] also proposed a power-efficient YOLOv2 architecture with a pipelined network structure. The PS runs Ubuntu OS with PYNQ [25] on it.…”
Section: Related Workmentioning
confidence: 99%