2010 IEEE Computer Society Annual Symposium on VLSI 2010
DOI: 10.1109/isvlsi.2010.84

BLAS Comparison on FPGA, CPU and GPU

Cited by 125 publications
(51 citation statements)
References 9 publications
“…We use some routines from the Basic Linear Algebra Subprograms (BLAS) [36]-[38] and the Linear Algebra PACKage (LAPACK) [39], [40] for key parts of the likelihood computations. Thereby we can guarantee high and portable performance of the computational core routines across a large variety of present and future platforms (E.g., [41]). …”
Section: Methods
Citation type: mentioning
confidence: 99%
“…Moreover, many comparative studies indicate that Field Programmable Gate Arrays (FPGAs) can often achieve better comprehensive properties than other platforms in most cases. For example, in the work of Zou et al [9], the efficiency of the FPGA implementation of the Smith-Waterman Algorithm is 3.4× compared to the Graphics Processing Unit (GPU) and over 40× compared to the Central Processing Unit (CPU), while Kestur et al [10] demonstrated that FPGA has similar performance at higher energy efficiency compared to the CPU and GPU platforms.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Custom built FPGA hardware is more efficient than general purpose components [10], [7]. Power consumption, a first-class constraint, can also be reduced compared to CPUs or GPUs [11], [8]. FPGAs can be configured to have far more parallel throughput than their general purpose counterparts.…” [Footnote in the citing paper: This work has been funded by the Artemis PaPP project number 295440.]
Section: Introduction
Citation type: mentioning
confidence: 99%