In multiple ways, Year 2011 marked an important transition for FPGA high-level synthesis (HLS), as it went from prototyping to deployment. A decade later, in this paper, we assess the progress of the deployment of HLS technology and highlight the successes in several application domains, including deep learning, video transcoding, graph processing, and genome sequencing. We also discuss the challenges faced by today’s HLS technology and the opportunities for further research and development, especially in the areas of achieving high clock frequency, coping with complex pragmas and system integration, legacy code transformation, building on open-source HLS infrastructures, supporting domain-specific languages, and standardization. We hope that this paper can inspire more research on FPGA HLS and bring it to a new height.
C/C++/OpenCL-based high-level synthesis (HLS) becomes more and more popular for eld-programmable gate array (FPGA) accelerators in many application domains in recent years, thanks to its competitive quality of result (QoR) and short development cycle compared with the traditional register-transfer level (RTL) design approach. Yet, limited by the sequential C semantics, it remains challenging to adopt the same highly productive high-level programming approach in many other application domains, where coarse-grained tasks run in parallel and communicate with each other at a ne-grained level. While current HLS tools support taskparallel programs, the productivity is greatly limited in the code development, correctness veri cation, and QoR tuning cycles, due to the poor programmability, restricted so ware simulation, and slow code generation, respectively. Such limited productivity o en defeats the purpose of HLS and hinder programmers from adopting HLS for task-parallel FPGA accelerators.In this paper, we extend the HLS C++ language and present a fully automated framework with programmer-friendly interfaces, universal so ware simulation, and fast code generation to overcome these limitations. Experimental results based on a wide range of real-world task-parallel programs show that, on average, the lines of kernel and host code are reduced by 22% and 51%, respectively, which considerably improves the programmability. e correctness veri cation and the iterative QoR tuning cycles are both greatly accelerated by 3.2× and 6.8×, respectively.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.