2022
DOI: 10.1145/3517131

Demystifying the Soft and Hardened Memory Systems of Modern FPGAs for Software Programmers through Microbenchmarking

Abstract: Both modern datacenter and embedded FPGAs provide great opportunities for high-performance and high energy-efficiency computing. With the growing public availability of FPGAs from major cloud service providers such as AWS, Alibaba, and Nimbix, as well as uniform hardware accelerator development tools (such as Xilinx Vitis and Intel oneAPI) for software programmers, hardware and software developers can now easily access FPGA platforms. However, it is nontrivial to develop efficient FPGA accelerators, especially…

Cited by 3 publications (1 citation statement)
References 30 publications
“…Thus, for each variable in an unsatisfied clause (line 8), approximately O × K ×4B of data will be read from DRAM. Considering that the DRAM channels in Alveo families have an initial access latency of 110 ns and a peak read bandwidth of 17.9 GB/s [27], the time required to read the clause and the literal indices can be approximated as (110 ns + (O × K ×4B) / 17.9 GB/s). Then the time taken for a flip can be estimated by considering that the loop in line 8 iterates K times and that there are 4 DRAM Based on the estimation model described above, Table 5 presents the FPGA-only throughput comparison between FYalSAT and the conventional WalkSAT FPGA accelerator architectures [14], [16], [17].…”
Section: Throughput and Resource Consumption
Citation type: mentioning
confidence: 99%
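
The read-time estimate in the citation statement above can be sketched numerically. This is a minimal illustration of the quoted expression only, not the citing authors' code. The function name, the parameter names, and the sample values of O and K are hypothetical; the 110 ns latency, the 17.9 GB/s bandwidth, and the 4 B index size come from the quoted text. It also assumes 1 GB = 10^9 bytes, so that 1 GB/s equals 1 byte per nanosecond.

```python
def dram_read_time_ns(o, k, bytes_per_index=4,
                      latency_ns=110.0, bandwidth_gb_per_s=17.9):
    """Estimate one DRAM read as latency + payload / bandwidth.

    o and k stand for the O and K symbols of the quoted statement.
    Assumption: 1 GB = 1e9 bytes, so N GB/s is N bytes per nanosecond.
    """
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    payload_bytes = o * k * bytes_per_index        # O x K x 4 B
    transfer_ns = payload_bytes / bandwidth_gb_per_s
    return latency_ns + transfer_ns


if __name__ == "__main__":
    # Hypothetical sample values, chosen only to exercise the formula.
    print(f"{dram_read_time_ns(o=10, k=3):.2f} ns")
```

For small payloads such as the hypothetical 120 bytes used here, the fixed 110 ns access latency dominates the estimate and the transfer term adds only a few nanoseconds.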