Reducing the amount of data stored by simulations will be of utmost importance for the next generation of large-scale computing. Accordingly, there is active research to shift analysis and visualization tasks to run in situ, that is, closer to the simulation via the sharing of some resources. This is beneficial as it can avoid the necessity of storing large amounts of data for post-processing. In this paper, we focus on the specific case of in situ visualization where analysis codes are collocated with the simulation's code and run on the same resources. It is important for an in situ technique to require minimum modifications to existing codes, be adaptable, and have a low impact on both run times and resource usage. We accomplish this through the Damaris/Viz framework, which provides in situ visualization support to the Damaris I/O middleware. The use of Damaris as a bridge to existing visualization packages allows us to (1) reduce code moditication to a minimum for existing simulations, (2) gather capabilities of several visualization tools to offer a unified data management interface, (3) use dedicated cores to hide the run time impact of in situ visualization and (4) efficiently use memory through a shared-memory-based communication model. Experiments are conducted on Blue Waters and Grid'5000 to visualize the CM1 atmospheric simulation and the Nek5000 CFD solver.
While many parallel visualization tools now provide in situ visualization capabilities, the trend has been to feed such tools with large amounts of unprocessed output data and let them render everything at the highest possible resolution. This leads to an increased run time of simulations that still have to complete within a fixed-length job allocation. In this paper, we tackle the challenge of enabling in situ visualization under performance constraints. Our approach shuffles data across processes according to its content and filters out part of it in order to feed a visualization pipeline with only a reorganized subset of the data produced by the simulation. Our framework leverages fast, generic evaluation procedures to score blocks of data, using information theory, statistics, and linear algebra. It monitors its own performance and adapts dynamically to achieve appropriate visual fidelity within predefined performance constraints. Experiments on the Blue Waters supercomputer with the CM1 simulation show that our approach enables a 5× speedup with respect to the initial visualization pipeline and is able to meet performance constraints.
Abstract-Remote visualization is an enabling technology aiming to resolve the barrier of physical distance. Although many researchers have developed innovative algorithms for remote visualization, previous work has focused little on systematically investigating optimal configurations of remote visualization architectures. In this paper, we study caching and prefetching, an important aspect of such architecture design, in order to optimize the fetch time in a remote visualization system. Unlike a processor cache or Web cache, caching for remote visualization is unique and complex. Through actual experimentation and numerical simulation, we have discovered ways to systematically evaluate and search for optimal configurations of remote visualization caches under various scenarios, such as different network speeds, sizes of data for user requests, prefetch schemes, cache depletion schemes, etc. We have also designed a practical infrastructure software to adaptively optimize the caching architecture of general remote visualization systems, when a different application is started or the network condition varies. The lower bound of achievable latency discovered with our approach can aid the design of remote visualization algorithms and the selection of suitable network layouts for a remote visualization system.
Large scale molecular dynamics simulations produce terabytes of data that is impractical to transfer to remote facilities. It is therefore necessary to perform visualization tasks in-situ as the data are generated, or by running interactive remote visualization sessions and batch analyses co-located with direct access to high performance storage systems. A significant challenge for deploying visualization software within clouds, clusters, and supercomputers involves the operating system software required to initialize and manage graphics acceleration hardware. Recently, it has become possible for applications to use the Embedded-system Graphics Library (EGL) to eliminate the requirement for windowing system software on compute nodes, thereby eliminating a significant obstacle to broader use of high performance visualization applications. We outline the potential benefits of this approach in the context of visualization applications used in the cloud, on commodity clusters, and supercomputers. We discuss the implementation of EGL support in VMD, a widely used molecular visualization application, and we outline benefits of the approach for molecular visualization tasks on petascale computers, clouds, and remote visualization servers. We then provide a brief evaluation of the use of EGL in VMD, with tests using developmental graphics drivers on conventional workstations and on Amazon EC2 G2 GPU-accelerated cloud instance types. We expect that the techniques described here will be of broad benefit to many other visualization applications.
The term “in situ processing” has evolved over the last decade to mean both a specific strategy for visualizing and analyzing data and an umbrella term for a processing paradigm. The resulting confusion makes it difficult for visualization and analysis scientists to communicate with each other and with their stakeholders. To address this problem, a group of over 50 experts convened with the goal of standardizing terminology. This paper summarizes their findings and proposes a new terminology for describing in situ systems. An important finding from this group was that in situ systems are best described via multiple, distinct axes: integration type, proximity, access, division of execution, operation controls, and output type. This paper discusses these axes, evaluates existing systems within the axes, and explores how currently used terms relate to the axes.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.