Alan Heirich scite author profile

This paper describes an application of a second-generation implementation of the Sepia architecture (Sepia-2) to interactive volumetric visualization of large rectilinear scalar fields. By employing pipelined associative blending operators in a sort-last configuration, a demonstration system with 8 rendering computers sustains 24 to 28 frames per second while interactively rendering large data volumes (1024x256x256 voxels and 512x512x512 voxels). We believe interactive performance at these frame rates and data sizes is unprecedented. We also believe these results can be extended to other types of structured and unstructured grids and to a variety of GL rendering techniques, including surface rendering and shadow mapping. We show how to extend our single-stage crossbar demonstration system to multi-stage networks in order to support much larger data sizes and higher image resolutions. This requires solving a dynamic mapping problem for a class of blending operators that includes Porter-Duff compositing operators.
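The abstract relies on the blending operators being associative, which is what lets partial images be combined in a pipeline or in any other grouping as long as depth order is kept. The sketch below illustrates this with the Porter-Duff "over" operator on premultiplied RGBA values. It is only an illustration under stated assumptions, not the Sepia-2 implementation: the function names and the single-pixel layer values are hypothetical.

```python
from functools import reduce

def over(front, back):
    """Porter-Duff 'over' for premultiplied RGBA tuples (front drawn on top of back)."""
    fr, fg, fb, fa = front
    br, bg, bb, ba = back
    k = 1.0 - fa  # fraction of the back layer that remains visible
    return (fr + k * br, fg + k * bg, fb + k * bb, fa + k * ba)

# Hypothetical one-pixel partial images from four renderers, ordered front to back.
layers = [
    (0.30, 0.00, 0.00, 0.50),
    (0.00, 0.20, 0.00, 0.25),
    (0.00, 0.00, 0.40, 0.50),
    (0.10, 0.10, 0.10, 1.00),
]

def composite_pipeline(layers):
    """Chain-style grouping: ((a over b) over c) over d."""
    return reduce(over, layers)

def composite_tree(layers):
    """Pairwise grouping: (a over b) over (c over d)."""
    if len(layers) == 1:
        return layers[0]
    mid = len(layers) // 2
    return over(composite_tree(layers[:mid]), composite_tree(layers[mid:]))

if __name__ == "__main__":
    print(composite_pipeline(layers))
    print(composite_tree(layers))
```

Both groupings give the same pixel up to floating-point rounding because "over" is associative. It is not commutative, so the front-to-back order of the layers still has to be preserved.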

Sepia: scalable 3D compositing using PCI Pamette

Moll, Heirich, Shand

Scalable Load Balancing by Diffusion

Heirich, 1994

Multinode Multi-GPU Two-Electron Integrals: Code Generation Using the Regent Language

Johnson, Mirchandaney, Hoag, et al. 2022, J. Chem. Theory Comput.

The computation of two-electron repulsion integrals (ERIs) is often the most expensive step of integral-direct self-consistent field methods. Formally it scales as O(N^4), where N is the number of Gaussian basis functions used to represent the molecular wave function. In practice, this scaling can be reduced to O(N^2) or less by neglecting small integrals with screening methods. The contributions of the ERIs to the Fock matrix are of Coulomb (J) and exchange (K) type and require separate algorithms to compute matrix elements efficiently. We previously implemented highly efficient GPU-accelerated J-matrix and K-matrix algorithms in the electronic structure code TeraChem. Although these implementations supported the use of multiple GPUs on a node, they did not support the use of multiple nodes. This presents a key bottleneck to cutting-edge ab initio simulations of large systems, e.g., excited state dynamics of photoactive proteins. We present our implementation of multinode multi-GPU J- and K-matrix algorithms in TeraChem using the Regent programming language. Regent directly supports distributed computation in a task-based model and can generate code for a variety of architectures, including NVIDIA GPUs. We demonstrate multinode scaling up to 45 GPUs (3 nodes) and benchmark against hand-coded TeraChem integral code. We also outline our metaprogrammed Regent implementation, which enables flexible code generation for integrals of different angular momenta.
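The abstract does not say which screening method is used. One standard choice is the Cauchy-Schwarz bound |(ij|kl)| <= Q_ij * Q_kl with Q_ij = sqrt((ij|ij)), which allows a quartet to be skipped when the product of two pair bounds falls below a threshold. The sketch below only counts how many quartets survive such a test. It is a toy model under loud assumptions: the pair bounds are a made-up Gaussian decay in the distance between centers on a line, not real integrals, and every name and parameter here is hypothetical rather than taken from TeraChem or Regent.

```python
import math

def pair_bounds(positions, alpha=1.0):
    """Toy stand-in for Q_ij: decays as a Gaussian in the distance between centers i and j."""
    n = len(positions)
    return [
        [math.exp(-0.5 * alpha * (positions[i] - positions[j]) ** 2) for j in range(n)]
        for i in range(n)
    ]

def count_surviving_quartets(q, threshold=1e-10):
    """Count quartets (ij|kl) whose bound Q_ij * Q_kl is at least the threshold."""
    n = len(q)
    pairs = [q[i][j] for i in range(n) for j in range(n)]
    return sum(1 for a in pairs for b in pairs if a * b >= threshold)

if __name__ == "__main__":
    # Hypothetical chain of 12 well-separated centers.
    positions = [3.0 * i for i in range(12)]
    q = pair_bounds(positions)
    total = len(positions) ** 4
    kept = count_surviving_quartets(q)
    print(f"{kept} of {total} quartets pass the bound")
```

In this toy setting most distant pairs have negligible bounds, so only a small fraction of the N^4 quartets pass. A production code would also avoid enumerating all quartets in the first place, for example by sorting the significant pairs.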


Alan Heirich

Efficient image-based methods for rendering soft shadows

Scalable interactive volume rendering using off-the-shelf components

Sepia: scalable 3D compositing using PCI Pamette

Scalable Load Balancing by Diffusion

Multinode Multi-GPU Two-Electron Integrals: Code Generation Using the Regent Language
