Getting rid of packets - Efficient SIMD single-ray traversal using multi-branching BVHs -

Wald, Ingo; Benthin, Carsten; Boulos, Solomon

doi:10.1109/rt.2008.4634620

Cited by 51 publications

(35 citation statements)

References 23 publications

“…The effect was particularly clear with trees wider than four. It seems likely, however, that our implementation was not as efficient as the one described by Wald et al [2008], because GTX285 does not have the prefix sum (compaction) instruction.…”

Section: Wide Trees (mentioning)

confidence: 99%

Understanding the efficiency of ray traversal on GPUs

Aila

Laine

2009

Proceedings of the Conference on High Performance Graphics 2009

We discuss the mapping of elementary ray tracing operations (acceleration structure traversal and primitive intersection) onto wide SIMD/SIMT machines. Our focus is on NVIDIA GPUs, but some of the observations should be valid for other wide machines as well. While several fast GPU tracing methods have been published, very little is actually understood about their performance. Nobody knows whether the methods are anywhere near the theoretically obtainable limits, and if not, what might be causing the discrepancy. We study this question by comparing the measurements against a simulator that tells the upper bound of performance for a given kernel. We observe that previously known methods are a factor of 1.5-2.5X off from theoretical optimum, and most of the gap is not explained by memory bandwidth, but rather by previously unidentified inefficiencies in hardware work distribution. We then propose a simple solution that significantly narrows the gap between simulation and measurement. This results in the fastest GPU ray tracer to date. We provide results for primary, ambient occlusion and diffuse interreflection rays.

“…These kernels leverage CPU support for the vector instruction sets SSE and AVX, and are "hand optimized" to further improve performance (133). Using vector instruction allows for the efficient traversal of a BVH with branching factors that match the width of the SIMD lanes (131). The axis-aligned bounding boxes (AABBs) of the child nodes are fetched and tested within the CPU vector units, and, using the same technique, a group of triangles can be tested against a single ray at once.…”

Section: Intel Embree (mentioning)

confidence: 99%
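The single-ray traversal step described in the statement above can be illustrated with a minimal sketch. It assumes a 4-wide node whose child bounding boxes are stored per axis, one entry per lane, and it tests one ray against all four boxes with the standard slab test. The names `WideNode` and `intersect_children` are hypothetical and do not come from Embree or from Wald et al.; the inner loop over lanes only stands in for what a single SIMD instruction would do, and the sketch assumes every component of the ray direction is nonzero.

```python
from dataclasses import dataclass
from typing import List, Tuple

WIDTH = 4  # assumed SIMD width: one lane per child box


@dataclass
class WideNode:
    """Hypothetical 4-wide BVH node with child bounds in per-axis lane arrays."""
    lo: Tuple[List[float], List[float], List[float]]  # lo[axis][lane]
    hi: Tuple[List[float], List[float], List[float]]  # hi[axis][lane]


def intersect_children(node, origin, inv_dir, t_min, t_max):
    """Slab-test one ray against all child boxes of `node`.

    Returns (hit_mask, t_entry), each with one entry per lane.
    Assumes inv_dir holds finite reciprocals (no zero direction components).
    """
    t_near = [t_min] * WIDTH
    t_far = [t_max] * WIDTH
    for axis in range(3):
        o, inv = origin[axis], inv_dir[axis]
        # This loop over lanes models one vector operation per line.
        for lane in range(WIDTH):
            t0 = (node.lo[axis][lane] - o) * inv
            t1 = (node.hi[axis][lane] - o) * inv
            if t0 > t1:  # negative direction: swap entry and exit
                t0, t1 = t1, t0
            t_near[lane] = max(t_near[lane], t0)
            t_far[lane] = min(t_far[lane], t1)
    hit_mask = [t_near[lane] <= t_far[lane] for lane in range(WIDTH)]
    return hit_mask, t_near


# Four child boxes: hit, miss (offset in y), miss (behind the ray), hit (contains origin).
example_node = WideNode(
    lo=([1.0, 1.0, -2.0, -1.0], [1.0, 3.0, -2.0, -1.0], [1.0, 1.0, -2.0, -1.0]),
    hi=([2.0, 2.0, -1.0, 1.0], [2.0, 4.0, -1.0, 1.0], [2.0, 2.0, -1.0, 1.0]),
)
mask, t_entry = intersect_children(
    example_node, (0.0, 0.0, 0.0), (1.0, 1.0, 1.0), 0.0, float("inf")
)
```

In this example the ray starts at the origin with direction (1, 1, 1), so `mask` marks the first and last children as hit. A real kernel would turn the mask into a bit field and push the hit children onto a traversal stack, typically ordered by entry distance.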

Performance Modeling of In Situ Rendering

Larsen

Harrison

Kress

et al. 2016

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis

With the push to exascale, in situ visualization and analysis will play an increasingly important role in high performance computing. Tightly coupling in situ visualization with simulations constrains resources for both, and these constraints force a complex balance of trade-offs. A performance model that provides an a priori answer for the cost of using an in situ approach for a given task would assist in managing the trade-offs between simulation and visualization resources. In this work, we present new statistical performance models, based on algorithmic complexity, that accurately predict the run-time cost of a set of representative rendering algorithms, an essential in situ visualization task. To train and validate the models, we create data-parallel rendering algorithms within a light-weight in situ infrastructure, and we conduct a performance study of an MPI+X rendering infrastructure used in situ with three HPC simulation applications. We then explore feasibility issues using the model for selected in situ rendering questions. This dissertation includes previously published coauthored material.

“…For example, multibranching BVH structures enable multiple nodes or primitives to be tested for intersection against the same ray in parallel. Quadbranching BVH (BVH4) data structures perform well with 4-wide and 16-wide vector units [Ernst and Greiner 2008; Dammertz et al 2008; Benthin et al 2012], but higher branching factors offer diminishing returns [Wald et al 2008]. …”

Section: Single-ray Vectorization (mentioning)

confidence: 99%

Embree

et al. 2014

Self Cite

Figure 1: Images produced by renderers which use the open source Embree ray tracing kernels. These scenes are computationally challenging due to complex geometry and spatially incoherent secondary rays. From left to right: The White Room model by Jay Hardy rendered in Autodesk RapidRT, a car model rendered in the Embree path tracer, a scene from the DreamWorks Animation movie "Peabody & Sherman" rendered with a prototype path tracer, and the Imperial Crown of Austria model by Martin Lubich rendered in the Embree path tracer.

Abstract: We describe Embree, an open source ray tracing framework for x86 CPUs. Embree is explicitly designed to achieve high performance in professional rendering environments in which complex geometry and incoherent ray distributions are common. Embree consists of a set of low-level kernels that maximize utilization of modern CPU architectures, and an API which enables these kernels to be used in existing renderers with minimal programmer effort. In this paper, we describe the design goals and software architecture of Embree, and show that for secondary rays in particular, the performance of Embree is competitive with (and often higher than) existing state-of-the-art methods on CPUs and GPUs.
