Abstract: Image registration is computationally intensive, and hence difficult to implement in real-time. In recent efforts, image registration algorithms have been implemented in field-programmable gate array (FPGA) technology to improve performance, while providing programmability and dynamic reconfigurability. In this paper, we present a novel architecture for dynamically-reconfigurable image registration, along with details on the methodology used to derive the architecture. Unlike previous FPGA implementations for …
“…A dynamically-reconfigurable FPGA implementation was proposed based on the PVV metric in [8]. We considered intervoxel parallelism along with intra-voxel parallelism, and we developed a comparison of the associated performance gains in [8].…”
Section: Results (citation type: mentioning)
confidence: 99%
This paper develops techniques for mapping rigid image registration applications onto configurable hardware. Image registration is a computationally intensive domain that places stringent requirements on performance and memory management efficiency. Building on the framework of homogeneous parameterized dataflow, which provides an effective formal model for design and analysis of hardware and software for signal processing applications, we develop novel methods for representing and exploring the hardware design space when mapping image registration algorithms into configurable hardware. Our techniques result in an efficient framework for trading off performance and configurable hardware resource usage based on the constraints of a given registration application.
“…With a streaming interface and coupling with high level descriptions in MATLAB, this developing environment enables fast and optimized implementations on medical image processing. An FPGA based mutual information evaluation system for IR is proposed in [4]. The system utilizes a data flow model to improve hardware parallelism by sub-volume division.…”
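The sub-volume division mentioned in this citation statement can be illustrated with a minimal software sketch. It assumes the standard joint-histogram definition of mutual information and relies on the fact that joint histograms of disjoint sub-volumes add up to the histogram of the whole volume, so each sub-volume can be processed independently and merged afterwards. The function names (`joint_histogram`, `mutual_information`, `mi_by_subvolumes`) and the flat voxel lists are hypothetical choices for illustration; this is not the architecture described in [4].

```python
import math
from collections import Counter

def joint_histogram(ref, flt):
    """Count co-occurring intensity pairs for two equally sized voxel lists."""
    return Counter(zip(ref, flt))

def mutual_information(hist):
    """Mutual information in bits from a joint histogram of intensity pairs."""
    total = sum(hist.values())
    n_ref, n_flt = Counter(), Counter()
    for (a, b), n in hist.items():
        n_ref[a] += n  # marginal count of reference intensity a
        n_flt[b] += n  # marginal count of floating intensity b
    mi = 0.0
    for (a, b), n in hist.items():
        # p(a,b) * log2(p(a,b) / (p(a) * p(b))), written with raw counts
        mi += (n / total) * math.log2(n * total / (n_ref[a] * n_flt[b]))
    return mi

def mi_by_subvolumes(ref, flt, n_sub):
    """Split the voxel stream into about n_sub chunks, histogram each, merge.

    Assumes non-empty inputs and n_sub >= 1 (hypothetical interface).
    """
    size = math.ceil(len(ref) / n_sub)
    merged = Counter()
    for start in range(0, len(ref), size):
        # Each chunk could be handled by a separate processing unit.
        merged += joint_histogram(ref[start:start + size],
                                  flt[start:start + size])
    return mutual_information(merged)

# Toy data: two identical "images" with four equally likely intensities.
ref = [0, 0, 1, 1, 2, 2, 3, 3]
flt = [0, 0, 1, 1, 2, 2, 3, 3]
whole = mutual_information(joint_histogram(ref, flt))
split = mi_by_subvolumes(ref, flt, 4)
```

In this toy case `whole` and `split` agree, because merging the per-chunk histograms reproduces the full joint histogram before the metric is evaluated.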
This paper proposes techniques for accelerating a software based image registration algorithm for 3D medical images targeting a reconfigurable hardware platform. Various methods, including dedicated fixed point arithmetic, error model based bit width analysis, architecture exploration and application-specific memory modules, are applied to address issues from the software algorithm and to maximize the performance of FPGA technology. Based on the reconfigurability of FPGA devices, the system can be extended to swap modules optimized for different parameters, and to adopt more advanced registration algorithms. We show that a single core on 412MHz XC5VLX330T FPGA can evaluate a rigid transformation of a 3D image with 16 million voxels in 35ms. With 30 cores on an FPGA, it is over 108 times faster than a multi-threaded implementation running on a 2.5GHz Intel Quad-Core Xeon platform.
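As a rough sanity check on the single-core figures quoted in this abstract, the arithmetic below derives the implied voxel throughput and clock cycles per voxel. It assumes "16 million voxels" means 16 × 10^6 (it could also mean 2^24) and that the 35 ms covers the whole volume at the stated 412 MHz clock; the variable names are illustrative and the derived values are not reported in the source.

```python
# Figures quoted in the abstract (interpretation is an assumption, see above).
voxels = 16_000_000   # "16 million voxels", taken as 16e6
time_s = 35e-3        # 35 ms per rigid transformation evaluation
clock_hz = 412e6      # 412 MHz FPGA clock

# Implied single-core throughput, about 4.57e8 voxels per second.
voxels_per_second = voxels / time_s

# Implied clock cycles spent per voxel, about 0.90.
cycles_per_voxel = clock_hz * time_s / voxels
```

Under these assumptions the core spends slightly less than one clock cycle per voxel on average, which would be consistent with more than one voxel being processed per cycle.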
“…In [18], Sen et al. presented a scheme that utilizes intra- and inter-voxel parallelization to optimize the coordinate transformation unit. Among actors C, D, E and F we find the same degree of intra-voxel parallelization.…”
Section: Non Rigid Registration - Dataflow Modeling and Associated Ana… (citation type: mentioning)