Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays 2019
DOI: 10.1145/3289602.3293918

Rapid Cycle-Accurate Simulator for High-Level Synthesis

Abstract: A large semantic gap between the high-level synthesis (HLS) design and the low-level (on-board or RTL) simulation environment often creates a barrier for those who are not FPGA experts. Moreover, such low-level simulation takes a long time to complete. Software-based HLS simulators can help bridge this gap and accelerate the simulation process; however, we found that the current commercial FPGA HLS software simulators sometimes produce incorrect results. In order to solve this correctness issue while maintainin…

Citations: cited by 12 publications (10 citation statements)
References: 19 publications
“…Some simulation statistics are not reported by the designers (N/A). keypair [58], gsm [59], HLSCNN [60], FlexNLP [61], Dataflow [62], and Opticalflow [63] do not feature any subaccelerators with a batch size greater than one. One OOB bug was found in gsm and one initialization bug was found in keypair.…”
Section: Appendix D Results (Extended) (mentioning)
confidence: 99%
“…Performance counters are inserted to the accelerator to collect the relevant metrics. We implement SPLAG using an open-source extension to HLS C++, TAPA [10], to leverage the convenient peeking interfaces, fast software simulation [6,12], asynchronous memory interfaces, simplified host-kernel interfaces, and coarse-grained floorplanning [26,27]. Our implementation targets the Alveo U280 board with 32 high-bandwidth memory (HBM) channels.…”
Section: Discussion (mentioning)
confidence: 99%
“…This problem was also pointed out in [8]. In order to run software simulation correctly, the programmer can change the source code to run tasks in multiple threads for software simulation, but doing so requires the same piece of task instantiation code to be written twice for synthesis and simulation, reducing productivity.…”
Section: Motivating Example (mentioning)
confidence: 99%
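
The multi-threaded workaround described in this statement can be illustrated with a minimal C++ sketch. It is not taken from the cited papers or from any HLS vendor library: `BoundedFifo`, `producer`, `consumer`, and `simulate_threaded` are hypothetical names, and the FIFO is a plain mutex-and-condition-variable queue standing in for a fixed-depth hardware stream. If the producer were run to completion before the consumer started, it would block forever once it had written more items than the FIFO depth; running each task in its own thread avoids that.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical bounded FIFO standing in for a fixed-depth HLS stream.
template <typename T>
class BoundedFifo {
 public:
  explicit BoundedFifo(std::size_t depth) : depth_(depth) { assert(depth > 0); }

  // Blocks while the FIFO is full, as a hardware writer would stall.
  void write(const T& value) {
    std::unique_lock<std::mutex> lock(mutex_);
    not_full_.wait(lock, [this] { return queue_.size() < depth_; });
    queue_.push(value);
    not_empty_.notify_one();
  }

  // Blocks while the FIFO is empty, as a hardware reader would stall.
  T read() {
    std::unique_lock<std::mutex> lock(mutex_);
    not_empty_.wait(lock, [this] { return !queue_.empty(); });
    T value = queue_.front();
    queue_.pop();
    not_full_.notify_one();
    return value;
  }

 private:
  std::size_t depth_;
  std::queue<T> queue_;
  std::mutex mutex_;
  std::condition_variable not_full_;
  std::condition_variable not_empty_;
};

// Task 1: writes the integers 0 .. count-1 into the stream.
void producer(BoundedFifo<int>& out, int count) {
  for (int i = 0; i < count; ++i) out.write(i);
}

// Task 2: reads count integers from the stream and accumulates them.
void consumer(BoundedFifo<int>& in, int count, long& sum) {
  sum = 0;
  for (int i = 0; i < count; ++i) sum += in.read();
}

// Simulation-only task instantiation: one OS thread per task.
long simulate_threaded(int count, std::size_t depth) {
  BoundedFifo<int> fifo(depth);
  long sum = 0;
  std::thread p(producer, std::ref(fifo), count);
  std::thread c(consumer, std::ref(fifo), count, std::ref(sum));
  p.join();
  c.join();
  return sum;
}
```

Calling `simulate_threaded(100, 2)` pushes 100 values through a FIFO of depth 2. The thread-spawning code in `simulate_threaded` exists only for simulation, which is the duplication of task instantiation code that the statement identifies as a productivity cost.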
“…However, they may perform poorly due to the inefficiency of inter-thread communication and context switch handled by the operating system. The FLASH simulator [8,12] proposed an alternative to the above, which relies on the HLS scheduling information to mimic the RTL FSM. While this simulation approach itself is faster than multi-thread simulators, generating simulation executable becomes slower due to the need of the HLS scheduler output for cycle-accuracy, which is not needed for correctness verification.…”
Section: Software Simulation (mentioning)
confidence: 99%
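
The general idea of simulating concurrent tasks as state machines in a single thread, rather than as OS threads, can be sketched as follows. This is only an illustration under stated assumptions, not the FLASH implementation: the state machines here are written by hand, whereas the statement above says FLASH derives them from HLS scheduling information. All names (`Fifo`, `Producer`, `Consumer`, `simulate_cycles`) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical fixed-depth FIFO; no locking is needed in a single thread.
struct Fifo {
  std::size_t depth;
  std::deque<int> data;
  bool full() const { return data.size() >= depth; }
  bool empty() const { return data.empty(); }
};

// Each task is an explicit state machine advanced by one step per cycle.
struct Producer {
  int next;   // next value to write
  int count;  // total number of values to write
  bool done() const { return next >= count; }
  void step(Fifo& out) {
    // Stall (do nothing this cycle) when the FIFO is full.
    if (!done() && !out.full()) out.data.push_back(next++);
  }
};

struct Consumer {
  int received;  // number of values read so far
  int count;     // total number of values to read
  long sum;      // running sum of the values read
  bool done() const { return received >= count; }
  void step(Fifo& in) {
    // Stall (do nothing this cycle) when the FIFO is empty.
    if (!done() && !in.empty()) {
      sum += in.data.front();
      in.data.pop_front();
      ++received;
    }
  }
};

struct Result {
  long sum;
  long cycles;
};

// Advances both state machines once per simulated cycle until the
// consumer has received every value.
Result simulate_cycles(int count, std::size_t depth) {
  assert(depth > 0);
  Fifo fifo{depth, {}};
  Producer p{0, count};
  Consumer c{0, count, 0};
  long cycles = 0;
  while (!c.done()) {
    c.step(fifo);  // consumer steps first, so a value written in this
    p.step(fifo);  // cycle only becomes readable in the next cycle
    ++cycles;
  }
  return Result{c.sum, cycles};
}
```

Because both tasks advance in lockstep inside one loop, there is no inter-thread communication or context switching, and the loop counter gives a cycle count for this toy model. Real cycle accuracy would depend on the actual schedule produced by the HLS tool, which this sketch does not model.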