FPGA-based accelerators are increasingly popular across a broad range of applications because they offer massive parallelism, high energy efficiency, and great flexibility for customization. However, difficulties in programming and integrating FPGAs have hindered their widespread adoption. Since the mid-2000s, there has been extensive research and development toward making FPGAs accessible to software-inclined developers, not only hardware specialists. Many programming models and automated synthesis tools, such as high-level synthesis, have been proposed to tackle this grand challenge. In this survey, we describe the progression and future prospects of the ongoing effort to significantly improve the software programmability of FPGAs. We first provide a taxonomy of the essential techniques for building a high-performance FPGA accelerator, which requires customization of the compute engines, memory hierarchy, and data representations. We then summarize a rich spectrum of work on programming abstractions and optimizing compilers that offer different trade-offs between performance and productivity. Finally, we highlight several additional challenges and opportunities that deserve extra attention from the community in order to bring FPGA-based computing to the masses.
Phase analysis, which groups execution intervals with similar execution behavior and resource requirements, has been widely used in a variety of dynamic systems, including dynamic cache reconfiguration, prefetching, and race detection. Although phase granularity is a major factor in the accuracy of phase prediction, it has not been well investigated, and most dynamic systems adopt a fine-grained prediction scheme. However, such a scheme can only take recent, local phase information into account and is frequently disturbed by transient noise from momentary phase changes, which can notably limit prediction accuracy. In this paper, we present the first investigation of the potential of multi-level phase analysis (MLPA), in which phase analyses at different granularities are combined to improve overall accuracy. The key observation is that a coarse-grained interval, which usually consists of stably distributed fine-grained intervals, can be accurately identified from the fine-grained intervals at the beginning of its execution. Based on this observation, we design and implement an MLPA scheme. In this scheme, a coarse-grained phase is first identified from the fine-grained intervals at the beginning of its execution. The subsequent fine-grained phases within it are then predicted from the sequence of fine-grained phases in that coarse-grained phase. Experimental results show that this scheme notably improves prediction accuracy. With a Markov fine-grained phase predictor as the baseline, MLPA improves prediction accuracy on SPEC2000 by 20%, 39%, and 29% for next-phase, phase-change, and phase-length prediction, respectively, while incurring only about 2% time overhead and 40% space overhead (about 360 bytes in total). To demonstrate the effectiveness of MLPA, we apply it to a dynamic cache reconfiguration system that dynamically adjusts the cache size to reduce the power consumption and access time of the data cache. Experimental results show that MLPA further reduces the average cache size by 15% compared to the fine-grained scheme.
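The two-level idea described in this abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that fine-grained phases are already available as integer IDs, that coarse-grained intervals have a fixed length (`coarse_len`), and that a coarse-grained phase is identified by the first `prefix_len` fine-grained phase IDs it contains. The class names, parameters, and the `accuracy` helper are all hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict


class MarkovPredictor:
    """First-order Markov baseline: predict the most frequent successor
    of the last observed fine-grained phase."""

    def __init__(self):
        self.table = defaultdict(Counter)  # last phase -> successor counts
        self.last = None

    def predict(self):
        counts = self.table.get(self.last)
        if not counts:
            return self.last  # no history yet: fall back to last-value prediction
        return counts.most_common(1)[0][0]

    def update(self, phase):
        if self.last is not None:
            self.table[self.last][phase] += 1
        self.last = phase


class MultiLevelPredictor:
    """Illustrative two-level predictor (assumptions: fixed-length coarse
    intervals, coarse phase identified by its first few fine phases)."""

    def __init__(self, coarse_len, prefix_len):
        self.coarse_len = coarse_len
        self.prefix_len = prefix_len
        self.fallback = MarkovPredictor()
        self.history = {}  # prefix signature -> recorded fine-phase sequence
        self.current = []  # fine phases seen so far in this coarse interval

    def predict(self):
        pos = len(self.current)
        if pos >= self.prefix_len:
            # Coarse phase identified from the start of the interval:
            # replay the fine-phase sequence recorded for it earlier.
            seq = self.history.get(tuple(self.current[:self.prefix_len]))
            if seq is not None and pos < len(seq):
                return seq[pos]
        return self.fallback.predict()

    def update(self, phase):
        self.fallback.update(phase)
        self.current.append(phase)
        if len(self.current) == self.coarse_len:
            # End of the coarse interval: remember its fine-phase sequence.
            self.history[tuple(self.current[:self.prefix_len])] = self.current
            self.current = []


def accuracy(predictor, trace):
    """Fraction of fine-grained phases predicted correctly, online."""
    hits = 0
    for phase in trace:
        hits += predictor.predict() == phase
        predictor.update(phase)
    return hits / len(trace)


if __name__ == "__main__":
    # Synthetic trace: two coarse phases with different fine-phase patterns.
    trace = ([0, 1, 0, 2, 0, 1] + [3, 3, 4, 3, 4, 4]) * 20
    print("markov     :", accuracy(MarkovPredictor(), trace))
    print("multi-level:", accuracy(MultiLevelPredictor(6, 2), trace))
```

On this synthetic trace the multi-level predictor scores higher than the Markov baseline, because once a coarse-grained phase has been seen, the rest of its fine-grained sequence is replayed rather than guessed from the previous phase alone. The trace is artificial, so the gap says nothing about the SPEC2000 figures reported above.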