2019
DOI: 10.1155/2019/3679839

Implementing and Evaluating an Heterogeneous, Scalable, Tridiagonal Linear System Solver with OpenCL to Target FPGAs, GPUs, and CPUs

Abstract: Solving diagonally dominant tridiagonal linear systems is a common problem in scientific high-performance computing (HPC). Furthermore, it is becoming more commonplace for HPC platforms to utilise a heterogeneous combination of computing devices. Whilst it is desirable to design faster implementations of parallel linear system solvers, power consumption concerns are increasing in priority. This work presents the oclspkt routine. The oclspkt routine is a heterogeneous OpenCL implementation of the truncated SPIK…

Cited by 8 publications (11 citation statements)
References 15 publications
“…With the introduction of High-Level synthesis (HLS) tools, a number of more recent works [14], [15], [16], [29] implemented the Thomas, PCR, and Spike algorithms on FPGA using HLS tools. Many of these works did not demonstrate the solver working on full applications, with the exception of László et al in 2015 [14] which compared a one factor Black-Scholes option pricing equation using explicit and implicit methods on different architectures such as multi core CPUs, GPUs, and FPGAs.…”
Section: Related Work (mentioning)
confidence: 99%
“…The FPGA performance with PCR is shown to be comparable to that of the GPU, but the Spike algorithm on the FPGA outperforms the GPU. Similarly Macintosh, et al in 2019 [15] uses OpenCL to develop oclspkt, a library that implements tridiagonal systems solvers targeting FPGAs, GPUs, and CPUs. oclspkt uses the truncated spike algorithm, and as such will not give exact solutions.…”
Section: Related Work (mentioning)
confidence: 99%
“…OpenMP is also not the only way to access SIMD within C/C++. For example, OpenCL kernels may be compiled for CPUs that support SIMD units (Hurn et al, 2016;Macintosh et al, 2019). Instruction level intrinsic functions (available in R via the RcppXsimd package) allow advanced features such as efficient random variates for small vectors, but this approach is very challenging and akin to machine code.…”
Section: Discussion (mentioning)
confidence: 99%
“…The architecture was implemented in Xilinx Kintex-7 FPGA and compared to the software algorithm. FPGAs offer high flexibility to Application-Specific Integrated Circuit (ASIC) when implementing the algorithm with a high degree of parallelism [9,10]. Results show that 37-75 times performance enhancement could be achieved with the accelerator's clock frequency at 100 MHz.…”
Section: Introduction (mentioning)
confidence: 99%