We present a compiler that takes high level signal and image processing algorithms described in MATLAB and generates an optimized hardware for an FPGA with external memory. We propose a precision analysis algorithm to determine the minimum number of bits required by an integer variable and a combined precision and error analysis algorithm to infer the minimum number of bits required by a floating point variable. Our results show that on an average, our algorithms generate hardware requiring a factor of 5 less FPGA resources in terms of the Configurable Logic Blocks (CLBs) consumed as compared to the hardware generated without these optimizations. We show that our analysis results in the reduction in the size of lookup tables for functions like sin, cos, sqrt, exp etc. Our precision analysis also enables us to pack various array elements into a single memory location to reduce the number of external memory accesses. We show that such a technique improves the performance of the generated hardware by an average of 35%.
We present an area and delay estimator in the context of a compiler that takes in high level signal and image processing applications described in MATLAB and performs automatic design space exploration to synthesize hardware for a Field Programmable Gate Array (FPGA) which meets the user area and frequency specifications. We present an area estimator which is used to estimate the maximum number of Configurable Logic Blocks (CLBs) consumed by the hardware synthesized for the Xilinx XC4010 from the input MATLAB algorithm. We also present a delay estimator which finds out the delay in the logic elements in the critical path and the delay in the interconnects. The total number of CLBs predicted by us is within 16% of the actual CLB consumption and the synthesized frequency estimated by us is within an error of 13% of the actual frequency after synthesis through Synplify logic synthesis tools and after placement and routing through the XACT tools from Xilinx. Since the estimators proposed by us are fast and accurate enough, they can be used in a high level synthesis framework like ours to perform rapid design space exploration.
Fast FPGA CAD tools that produce high quality results has been one o] the most important research issues in the FPGA domain. Simulated annealing has been the method of choice for placement. However, simulated annealing is a very compute-intensive method. In our present work we investigate a range of parallelization strategies to speedup simulated annealing with application to placement ]or FPGA. We present experimental results obtained by applying the different parallelization strategies to the Versatile Place and Route (VPR) Tool, implemented on an SGI Origin shared memory multiprocessor and an IBM-SP2 distributed memory multiprocessor. The results show the tradeoff between execution time and quality of result for the different parallelization strategies.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.