This paper describes an application of a second-generation implementation of the Sepia architecture (Sepia-2) to interactive volumetric visualization of large rectilinear scalar fields. By employing pipelined associative blending operators in a sort-last configuration, a demonstration system with 8 rendering computers sustains 24 to 28 frames per second while interactively rendering large data volumes (1024x256x256 voxels and 512x512x512 voxels). We believe interactive performance at these frame rates and data sizes is unprecedented. We also believe these results can be extended to other types of structured and unstructured grids and to a variety of GL rendering techniques, including surface rendering and shadow mapping. We show how to extend our single-stage crossbar demonstration system to multi-stage networks in order to support much larger data sizes and higher image resolutions. This requires solving a dynamic mapping problem for a class of blending operators that includes the Porter-Duff compositing operators.
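The role of associativity in sort-last blending can be illustrated with a minimal sketch. It assumes premultiplied-alpha RGBA pixels and the Porter-Duff "over" operator. The fragment values and the helper names `over` and `close` are hypothetical and are not taken from the Sepia-2 system. Because "over" is associative, partial images can be blended in a linear pipeline or in grouped stages with the same result, provided the depth order is preserved.

```python
from functools import reduce

def over(front, back):
    """Porter-Duff 'over' for premultiplied RGBA tuples (r, g, b, a)."""
    k = 1.0 - front[3]  # fraction of the back pixel that shows through
    return tuple(f + k * b for f, b in zip(front, back))

def close(p, q, tol=1e-12):
    """Compare two pixels component-wise within a tolerance."""
    return all(abs(x - y) <= tol for x, y in zip(p, q))

# Hypothetical contributions of four renderers to one pixel,
# already sorted front to back (illustrative values only).
fragments = [
    (0.20, 0.00, 0.00, 0.25),
    (0.00, 0.30, 0.00, 0.50),
    (0.00, 0.00, 0.10, 0.125),
    (0.40, 0.40, 0.40, 0.50),
]

# Linear pipeline: each stage blends the running result over the next fragment.
pipeline = reduce(over, fragments)

# Grouped stages: blend neighbouring pairs first, then blend the partial results.
tree = over(over(fragments[0], fragments[1]),
            over(fragments[2], fragments[3]))

print(close(pipeline, tree))  # associativity: both groupings agree
print(close(over(fragments[0], fragments[1]),
            over(fragments[1], fragments[0])))  # order still matters
```

The operator is associative but not commutative, so the grouping of blends is free while the front-to-back order is not. That ordering constraint is one reason a mapping of renderers onto network stages has to follow the current view.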
The computation of two-electron repulsion integrals (ERIs) is often
the most expensive step of integral-direct self-consistent field methods.
Formally it scales as O(N^4), where N is the number of Gaussian basis functions
used to represent the molecular wave function. In practice, this scaling
can be reduced to O(N^2) or less by neglecting small integrals with screening methods. The
contributions of the ERIs to the Fock matrix are of Coulomb (J) and
exchange (K) type and require separate algorithms to compute matrix
elements efficiently. We previously implemented highly efficient GPU-accelerated
J-matrix and K-matrix algorithms in the electronic structure code
TeraChem. Although these implementations supported the use of multiple
GPUs on a node, they did not support the use of multiple nodes. This
presents a key bottleneck to cutting-edge ab initio simulations of
large systems, e.g., excited state dynamics of photoactive proteins.
We present our implementation of multinode multi-GPU J- and K-matrix
algorithms in TeraChem using the Regent programming language. Regent
directly supports distributed computation in a task-based model and
can generate code for a variety of architectures, including NVIDIA
GPUs. We demonstrate multinode scaling up to 45 GPUs (3 nodes) and
benchmark against hand-coded TeraChem integral code. We also outline
our metaprogrammed Regent implementation, which enables flexible code
generation for integrals of different angular momenta.
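The effect of screening on the formal O(N^4) cost can be illustrated with a toy sketch based on the Schwarz inequality, |(ij|kl)| <= Q_ij * Q_kl with Q_ij = sqrt((ij|ij)). The sketch assumes a hypothetical model in which N centers sit on a line and Q_ij decays as a Gaussian in their separation. The function names, spacing, exponent, and threshold are illustrative choices, and this is not the screening scheme used in TeraChem.

```python
import math

def schwarz_factors(n, spacing=2.0, exponent=0.5):
    """Toy bounds Q_ij for n centers on a line, one value per pair with j <= i.

    Assumed model: Q_ij = exp(-exponent * (spacing * (i - j))**2).
    """
    return [math.exp(-exponent * (spacing * (i - j)) ** 2)
            for i in range(n) for j in range(i + 1)]

def count_quartets(n, threshold=1e-10):
    """Return (kept, total) counts of pair-of-pair combinations.

    A combination (ij|kl) is kept when its Schwarz bound Q_ij * Q_kl
    is at least the threshold; all others would be neglected.
    """
    q = schwarz_factors(n)
    kept = sum(1 for a in q for b in q if a * b >= threshold)
    return kept, len(q) ** 2

kept_20, total_20 = count_quartets(20)
kept_40, total_40 = count_quartets(40)

# Doubling N multiplies the unscreened count by roughly 16 (quartic growth),
# while the screened count grows by a factor much closer to 4 (quadratic).
print(total_40 / total_20, kept_40 / kept_20)
```

In this toy model only pairs of nearby centers carry a significant bound, so the number of surviving combinations grows roughly with the square of N rather than with its fourth power.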