Finite-Difference Time-Domain (FDTD) is a kernel used to solve problems in electromagnetics applications such as microwave tomography. It is both data-intensive and computation-intensive. However, its computation scheme suggests that an architecture with SIMD support can outperform traditional architectures without it. The Cell Broadband Engine (Cell/B.E.) processor is an implementation of a heterogeneous multicore architecture. It consists of one conventional microprocessor, the PowerPC Processor Element (PPE), and eight SIMD co-processor elements, the Synergistic Processor Elements (SPEs). One unique feature of an SPE is its uniform register file of 128 entries, each 128 bits wide, which supports SIMD. FDTD may therefore map well onto the Cell/B.E. processor. However, each SPE can directly access only a 256 KB local store (LS), which holds both instructions and data. The LS is much smaller than what an accurate FDTD simulation needs, since such a simulation requires a large number of fine-grained Yee cells. In this paper, we design the algorithm on the Cell/B.E. by efficiently using the asynchronous DMA (direct memory access) mechanism available on an SPE to transfer data between its LS and main memory over the high-bandwidth on-chip Element Interconnect Bus (EIB). The new algorithm was run on an IBM QS20 blade running at 3.2 GHz. For a computation domain of 600 × 600 Yee cells, we achieve an overall speedup of 14.14 over an AMD Athlon and 7.05 over an AMD Opteron at the processor level.
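The local-store constraint described in this abstract can be illustrated with a small sketch. It is not the paper's implementation: it assumes a 2D TMz formulation (fields Ez, Hx, Hy), normalized units, and three single-precision field arrays, none of which the abstract states. The function names `field_bytes` and `update_h_blocked` are hypothetical. The sketch copies one block of rows, plus one halo row of Ez, into a local buffer (standing in for a DMA transfer into the LS), updates the magnetic field there, and writes the block back.

```python
LOCAL_STORE_BYTES = 256 * 1024  # SPE local store size quoted in the abstract


def field_bytes(nx, ny, n_fields=3, bytes_per_value=4):
    """Memory for the field arrays (assumed: 3 fields, single precision)."""
    return nx * ny * n_fields * bytes_per_value


def update_h_blocked(ez, hx, hy, c, block_rows):
    """Update Hx and Hy from Ez, one block of rows at a time.

    Each block is copied into local lists (a stand-in for DMA into the
    local store), updated, and copied back. The Hy update needs Ez from
    the next row, so one extra halo row of Ez is fetched per block.
    """
    nx, ny = len(ez), len(ez[0])
    for start in range(0, nx, block_rows):
        stop = min(start + block_rows, nx)
        # "DMA in": the block's rows, plus one halo row of Ez if it exists
        ez_loc = [row[:] for row in ez[start:min(stop + 1, nx)]]
        hx_loc = [row[:] for row in hx[start:stop]]
        hy_loc = [row[:] for row in hy[start:stop]]
        for li in range(stop - start):
            for j in range(ny):
                if j + 1 < ny:
                    hx_loc[li][j] -= c * (ez_loc[li][j + 1] - ez_loc[li][j])
                if li + 1 < len(ez_loc):
                    hy_loc[li][j] += c * (ez_loc[li + 1][j] - ez_loc[li][j])
        # "DMA out": write the updated block back to main memory
        hx[start:stop] = hx_loc
        hy[start:stop] = hy_loc
```

Under these assumptions, `field_bytes(600, 600)` is 4,320,000 bytes, well above the 262,144-byte local store, which is why the domain has to be streamed through in pieces. On real hardware the transfers would be asynchronous, so the next block could be fetched while the current one is being computed; this sequential sketch does not model that overlap.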
Breast cancer is, after lung cancer, the leading cause of cancer deaths in women. It is also one of the few cancers that can be controlled by screening asymptomatic women and following up with effective treatment. One recent screening modality under development, microwave tomography, exploits the apparent contrast in dielectric properties between different breast tissues at microwave frequencies. Microwave tomography relies on a numerical model: image reconstruction consists of iteratively searching candidate breast structures, applying the numerical model to each candidate, and matching the computed results against measured data. This paper focuses on Finite-Difference Time-Domain (FDTD) for the numerical model and a Genetic Algorithm (GA) for the iterative search. Both FDTD and GA are time-consuming, yet they are data-parallel in nature. In this paper, a parallel algorithm integrating GA and FDTD for detecting tumors using the microwave tomography technique is presented. The algorithm is implemented on distributed-memory machines using MPI.
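The search loop described in this abstract (propose candidate structures, run the numerical model, compare with measurements) can be sketched as follows. This is a minimal illustration, not the paper's algorithm: `forward_model` is a hypothetical stand-in for an FDTD solve, the two-parameter candidate encoding is invented for the example, and the selection, crossover, and mutation choices are assumptions.

```python
import random


def forward_model(params):
    """Hypothetical stand-in for an FDTD solve: candidate -> simulated data."""
    eps, pos = params
    return [eps * (k - pos) ** 2 for k in range(8)]


def misfit(params, measured):
    """Sum of squared differences between simulated and measured data."""
    simulated = forward_model(params)
    return sum((s - m) ** 2 for s, m in zip(simulated, measured))


def ga_search(measured, pop_size=30, generations=40, seed=0):
    """Elitist GA: keep the best fifth, refill by crossover plus mutation."""
    rng = random.Random(seed)
    pop = [(rng.uniform(1, 10), rng.uniform(0, 7)) for _ in range(pop_size)]
    history = []  # best misfit seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=lambda p: misfit(p, measured))
        history.append(misfit(pop[0], measured))
        elite = pop[: pop_size // 5]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover per gene, then small Gaussian mutation
            child = tuple(rng.choice(pair) + rng.gauss(0, 0.1)
                          for pair in zip(a, b))
            children.append(child)
        pop = elite + children
    pop.sort(key=lambda p: misfit(p, measured))
    history.append(misfit(pop[0], measured))
    return pop[0], history
```

Each call to `misfit` is independent of the others within a generation, which is the data parallelism the abstract refers to: in a distributed-memory setting the candidates could be divided among processes and only the misfit values gathered. Because the elite candidates are carried over unchanged, the best misfit never gets worse from one generation to the next.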
Algebraic reconstruction techniques require about half as many projections as Fourier backprojection methods, which makes them safer in terms of required radiation dose. The algebraic reconstruction technique (ART) and its variant OS-SART (ordered subset simultaneous ART) provide faster convergence with comparatively good image quality. However, the prohibitively long processing time of these techniques prevents their adoption in commercial CT machines. Parallel computing is one solution to this problem. With the advent of heterogeneous multicore architectures that exploit data-parallel applications, medical imaging algorithms such as OS-SART can be studied to produce increased performance. In this paper, we map OS-SART onto the Cell Broadband Engine (Cell BE). We effectively use the architectural features of the Cell BE to provide an efficient mapping. The Cell BE consists of one PowerPC Processor Element (PPE) and eight SIMD coprocessors known as Synergistic Processor Elements (SPEs). The limited memory storage on each of the SPEs makes the mapping challenging. Therefore, we present optimization techniques to efficiently map the algorithm onto the Cell BE for improved performance over the CPU version. We compare the performance of our proposed algorithm on the Cell BE to that on a Sun Fire X4600, a shared-memory machine. The Cell BE is five times faster than the AMD Opteron dual-core processor. The speedup of the algorithm on the Cell BE increases with the number of SPEs. We also experiment with various parameters that impact the performance of the algorithm, such as the number of subsets, the number of processing elements, and the number of DMA transfers between main memory and local memory.
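For readers unfamiliar with OS-SART, the update it performs can be sketched on a tiny dense system. This is a generic textbook form of the method under stated assumptions, not the paper's Cell BE mapping: the system matrix `A` is a small nonnegative dense list of lists with no all-zero rows, the subsets are given explicitly, and the function name `os_sart` and the relaxation parameter `lam` are choices made for this example.

```python
def os_sart(A, b, subsets, n_iter=1, lam=1.0):
    """Ordered-subset SART for A x = b, starting from x = 0.

    For each subset of rows (rays), every residual is computed from the
    current x, normalized by its row sum, backprojected, and normalized
    by the subset's column sum. x is updated once per subset.
    Assumes nonnegative A with no all-zero rows.
    """
    n = len(A[0])
    x = [0.0] * n
    row_sum = [sum(row) for row in A]
    for _ in range(n_iter):
        for rows in subsets:
            col_sum = [sum(A[i][j] for i in rows) for j in range(n)]
            # Normalized residual per ray, all computed before x changes
            res = {
                i: (b[i] - sum(a * xj for a, xj in zip(A[i], x))) / row_sum[i]
                for i in rows
            }
            for j in range(n):
                if col_sum[j] > 0:  # skip pixels no ray in this subset hits
                    back = sum(A[i][j] * res[i] for i in rows)
                    x[j] += lam * back / col_sum[j]
    return x


def residual_sq(A, b, x):
    """Squared norm of b - A x."""
    return sum((bi - sum(a * xj for a, xj in zip(row, x))) ** 2
               for row, bi in zip(A, b))


A = [[1, 1, 0], [0, 1, 1], [1, 0, 1], [1, 1, 1]]
b = [3, 5, 4, 6]  # consistent with x = [1, 2, 3]
x1 = os_sart(A, b, subsets=[[0, 2], [1, 3]], n_iter=1)
```

After one pass over both subsets, `x1` is `[2.0, 2.0, 2.5]` and the squared residual has dropped from 86 (at x = 0) to 1.75. The number of subsets trades update frequency against work per update, which is one of the parameters the abstract says the authors vary.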