2015
DOI: 10.1587/elex.12.20150161

A deeply-pipelined FPGA-based SpMV accelerator with a hardware-friendly storage scheme

Abstract: This paper presents a high-performance sparse matrix-vector multiplication (SpMV) accelerator on a field-programmable gate array (FPGA). It uses a hardware-friendly storage scheme named Variable-Bit-Width Coordinate Block Quasi Compressed Sparse Row, whose nested block compression and variable-bit-width column-index encoding greatly reduce redundant computation and memory accesses. Based on the proposed compression scheme, a deeply-pipelined SpMV accelerator is implemented …
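
For orientation, the sketch below shows SpMV over the plain Compressed Sparse Row (CSR) layout that the scheme's name refers to. It is not the paper's Variable-Bit-Width Coordinate Block Quasi CSR format, whose details are not given in the abstract. The function names `csr_spmv` and `index_bits` and the example matrix are assumptions made for illustration. `index_bits` only illustrates the general idea that a column index local to a narrow block needs fewer bits than a full-width index.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in plain CSR form."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # row_ptr[row]..row_ptr[row + 1] delimits the nonzeros of this row.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


def index_bits(block_width):
    """Bits needed to address a column inside a block of the given width.

    Hypothetical helper: shows why block-local indices can be narrower
    than full column indices; not the encoding used in the paper.
    """
    return max(1, (block_width - 1).bit_length())


# Example 3x4 matrix (illustrative data only):
# [[5, 0, 0, 2],
#  [0, 0, 3, 0],
#  [1, 0, 0, 4]]
values = [5.0, 2.0, 3.0, 1.0, 4.0]
col_idx = [0, 3, 2, 0, 3]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0, 4.0]

y = csr_spmv(values, col_idx, row_ptr, x)
```

In plain CSR every nonzero carries a full-width column index, and that per-nonzero index storage is the kind of overhead a variable-bit-width encoding targets.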

Cited by 3 publications (2 citation statements)
References 13 publications
“…The source codes of this framework will be publicly released on GitHub for the further improvement the performance of this detection problem https://github.com/ylzhhn/cervical_smear_3phases_detect_sys. In the future, the framework can also be accelerated by the parallel programming techniques [33,34]. Moreover, the analysis to recognise different types of the pre-cancerous cells is a meaningful research question in this field.…”
Section: Results
confidence: 99%
“…Notably, the ResNet network introduced the concept of residual blocks, enabling the training of deeper neural networks, which helps mitigate the vanishing gradient problem [14]. Work on FPGAs, which offer the advantages of low latency, low power consumption, and high flexibility over traditional hardware acceleration solutions, has been widely carried out [15][16][17][18][19][20][21][22][23][24][25][26][27]. However, they face limitations in on-chip resources, and modifications in network architecture necessitate hardware circuit redesign.…”
Section: Introduction
confidence: 99%