1: Introduction

MorphoSys [1] is a reconfigurable SIMD processor targeted at portable devices such as cellular phones and PDAs. It combines coarse-grained reconfigurable hardware with one general-purpose processor. Applications with a heterogeneous nature and different sub-tasks, such as MPEG, DVB-T, and CDMA, can be implemented efficiently on it. In this paper, a 3D graphics algorithm, ray tracing, is mapped onto MorphoSys to achieve realistic illumination. We show that SIMD ray tracing on MorphoSys consumes less power and has a lower hardware cost than both multiprocessor and single-CPU approaches.

Ray tracing [2] is a global illumination model. It is well known for being highly computation-intensive because of its recursive behavior. Recent rapid advances in VLSI technology have helped achieve interactive ray tracing on a multiprocessor [3] and a cluster system [9,10] for large scenes, and on a single PC with SIMD extensions [4] for small scenes. In [3], Parker achieves 15 frames/second for a 512x512 image by running ray tracing on a 60-node (MIPS R12000) SGI Origin 2000 system. Each node has a clock faster than 250 MHz, 64-bit data paths, floating-point units, 64K L1 cache, 8 MB L2 cache, and at least 64 MB of main memory. Muuss [9,10] worked on parallel and distributed ray tracing for over a decade. Using a cluster of SGI Power Challenge machines [10], performance similar to Parker's is reached. The two efforts differ in task granularity, load balancing, and synchronization mechanisms. The disadvantage of both is the extra cost: high clock frequency, floating-point support, large memory bandwidth, and efficient communication and scheduling mechanisms. Usually, the sub-division structure (such as a BSP (Binary Space Partitioning) tree [7,8]) is replicated in each processor during traversal. As will be seen, only one copy is saved in our implementation.

Wald [4] used a single PC (dual Pentium-III, 800 MHz, 256 MB) to render 512x512 images and obtained 3.6 frames/second. Four-way ray coherence is exploited using SIMD instructions. Hardware support for floating point, together with advanced branch prediction and speculation mechanisms, helps speed up ray tracing. Ray incoherence is handled using a scheme similar to the multi-pass scheme [14], which requires saving intermediate data and thus causes some processors to idle.

The migration from fixed-function pipelines to programmable processors also makes ray tracing feasible on graphics hardware [5,6]. Purcell [6] proposes a ray-tracing mapping scheme on a pipelined graphics system with a programmable fragment stage. The proposed processor requires floating-point support and is intended to exploit large parallelism. A multi-pass scheme is used to handle ray incoherence. As a result, the utilization of the SIMD fragment processors is very low (less than 10%). Carr [5] mapped ray-object intersection onto programmable shading hardware, the Ray Engine. The Ray Engine is organized as a matrix, with vertical lines indexed by trian...

MorphoSys is a reconfigurable SIMD architecture. In this paper, BSP-based ray tracing is mapped onto MorphoSys. The mapping exploits ray-tracing parallelism extensively. A straightforward mechanism handles irregularity among parallel rays in the BSP tree. To support this mechanism, a special data structure is established in which no intermediate data has to be saved. Moreover, optimizations such as object reordering and merging are facilitated. Data starvation is avoided by overlapping data transfer with intensive computation, so that applications of different complexity can be managed efficiently. Since MorphoSys is small and power efficient, we demonstrate that it is an economical platform for 3D animation applications on portable devices.
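
The frame rates quoted in the introduction for [3] and [4] can be put on a rough common footing by converting them to primary rays per second. The sketch below is only a back-of-the-envelope illustration: it assumes exactly one primary ray per pixel and ignores secondary rays (shadows, reflections), and the function name `primary_rays_per_second` is hypothetical, not taken from the paper. The two systems also rendered different scenes, so the figures are not a direct performance comparison.

```python
def primary_rays_per_second(width, height, fps):
    """Primary rays traced per second, assuming one primary ray per pixel."""
    return width * height * fps


# Figures quoted in the introduction (resolution and frame rate only).
parker_total = primary_rays_per_second(512, 512, 15)   # 60-node SGI Origin 2000 [3]
wald_total = primary_rays_per_second(512, 512, 3.6)    # dual Pentium-III PC [4]

# Spread the multiprocessor figure evenly over its 60 nodes
# (an idealization: it ignores load imbalance and communication).
parker_per_node = parker_total / 60

print(f"Parker [3]: {parker_total:,.0f} primary rays/s in total")
print(f"Parker [3]: {parker_per_node:,.0f} primary rays/s per node")
print(f"Wald   [4]: {wald_total:,.0f} primary rays/s in total")
```

Under these assumptions, the multiprocessor in [3] sustains about 3.9 million primary rays per second (roughly 65,500 per node), while the PC in [4] sustains about 0.94 million.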