This article presents fast online placement methods for dynamically reconfigurable systems, as well as offline 3D placement algorithms for statically reconfigurable architectures.

As FPGAs get larger and faster, both the number and the complexity of the modules to load on them increase, so better speedups can be achieved by exploiting FPGAs in hardware systems. Gokhale et al. report speedups of 200 times for the string matching problem. Adario et al. achieve a threefold improvement over the pipelined implementation of image processing applications by exploiting dynamic reconfiguration of the hardware. Furthermore, the ability to reconfigure the chip as it is running enables the implementation of dynamically reconfigurable hardware systems that adapt themselves to the application for better performance. Hauck has reported many applications in reconfigurable systems. Such systems usually consist of a host processor and an FPGA "coprocessor" called a reconfigurable functional unit (RFU). The RFU can be programmed while the program is running, with varying configurations at different stages of the program.

An example is shown in Figure 1. As shown in Figure 1a, three parts of the code are mapped to RFU operations (RFUOPs, also called modules). When the program is running the loop containing RFUOP2 (time t1), two RFUOPs are loaded on the chip. Later, when the program is about to enter the loop at time t2, there is no space on the RFU to place RFUOP3. Hence, RFUOP2 is swapped out of the chip, and RFUOP3 is loaded. RFUOP1 is still on the chip and can be reused later in the program.

Unfortunately, rather long delays in reprogramming RFUs keep us from achieving very high speedups for general-purpose computing.
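The swap-out scenario of Figure 1 can be sketched as a small simulation. This is only an illustrative toy, not the placement method of this article: it treats the RFU as a single pool of area units (ignoring 2D placement and fragmentation), the module sizes and the capacity are made-up numbers, and the eviction rule (swap out the module whose next use is farthest in the future) is an assumption chosen because it reproduces the behavior described above. The names `RFU`, `next_use`, and `run` are hypothetical.

```python
from dataclasses import dataclass, field


def next_use(module, future):
    """Index of the next request for `module`, or infinity if it is never used again."""
    return future.index(module) if module in future else float("inf")


@dataclass
class RFU:
    """Toy RFU: a pool of area units with no 2D placement or fragmentation."""
    capacity: int                                  # hypothetical area budget
    resident: dict = field(default_factory=dict)   # module name -> area in use
    loads: int = 0                                 # number of reconfigurations so far

    def free_area(self):
        return self.capacity - sum(self.resident.values())

    def request(self, name, area, future):
        """Make `name` resident and return the list of modules swapped out."""
        if area > self.capacity:
            raise ValueError(f"{name} does not fit on the RFU at all")
        if name in self.resident:
            return []                              # reuse: no reconfiguration needed
        evicted = []
        while self.free_area() < area:
            # Assumed policy: evict the module needed farthest in the future.
            victim = max(self.resident, key=lambda m: next_use(m, future))
            del self.resident[victim]
            evicted.append(victim)
        self.resident[name] = area
        self.loads += 1                            # each load pays a reconfiguration delay
        return evicted


def run(trace, areas, capacity):
    """Replay a sequence of RFUOP requests and log what each one swapped out."""
    rfu = RFU(capacity)
    log = []
    for i, name in enumerate(trace):
        log.append((name, rfu.request(name, areas[name], trace[i + 1:])))
    return rfu, log


# Made-up sizes: any two of the three modules nearly fill the chip.
areas = {"RFUOP1": 4, "RFUOP2": 5, "RFUOP3": 6}
trace = ["RFUOP1", "RFUOP2", "RFUOP3", "RFUOP1"]
rfu, log = run(trace, areas, capacity=10)
for name, evicted in log:
    print(name, "swapped out:", evicted)
print("reconfigurations:", rfu.loads)
```

With these assumed sizes, the request for RFUOP3 swaps out RFUOP2, the final request for RFUOP1 finds it still on the chip, and the run performs three loads in total. Each load stands for one reconfiguration delay of the kind discussed here.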
Wirthlin and Hutchings report an overall speedup of 23 times, while the speedup could be 80 times if the configuration time were zero (the configuration time is 16% to 71% of the total running time).

We need fast and powerful physical design CAD tools to do configuration management of the RFUs both offline and online. In the offline version, the flow of the program is known in advance (e.g., in DSP applications or loops containing basic blocks); hence, the scheduler and configuration management component can do various optimizations in the configuration of the RFU before the system starts running. In the online version, by contrast, the decision on which operations should be launched is not known beforehand. The flow of the program is not known in advance; hence, the RFU configuration management must be done on the fly. An example of such a case is multithreading, in which the flow of the code cannot be determined beforehand.

Both online and offline versions of the template placement algorithms are important for

0740-7475/00/$10.00 © 2000 IEEE
IEEE Design & Test of Computers