Abstract. Terrains and other geometric models have traditionally been stored locally. Accessing them remotely combines characteristics of file serving and of real-time streaming of audio-visual media. This paper presents a terrain streaming system based on a client-server architecture that handles heterogeneous clients over low-bandwidth networks. We present an efficient representation for terrain streaming. We design a client-server system that uses this representation to efficiently stream virtual environments containing terrains and overlaid geometry. We handle dynamic entities in the environment and their synchronization across multiple clients. We also present a method for sharing and storing terrain annotations for collaboration among multiple users. We conclude by presenting preliminary performance data for the streaming system.
The compute work rasterizer, or the GigaThread Engine, of a modern NVIDIA GPU focuses on maximizing compute work occupancy across all streaming multiprocessors in a GPU while retaining design simplicity. In this article, we identify the operational aspects of the GigaThread Engine that help it meet those goals but also lead to less-than-ideal cache locality for texture accesses in 2D compute shaders, which are an important optimization target for gaming applications. We develop three software techniques, namely LargeCTAs, Swizzle, and Agents, to show that it is possible to effectively exploit the texture data working set overlap intrinsic to 2D compute shaders.
We evaluate these techniques on gaming applications across two generations of NVIDIA GPUs, RTX 2080 and RTX 3080, and find that they are effective on both GPUs. We find that the bandwidth savings from each of our software techniques on RTX 2080 are much higher than the savings that baseline execution obtains from the inter-generational cache capacity increase going from RTX 2080 to RTX 3080. Our best-performing technique, Agents, records up to a 4.7% average full-frame speedup by reducing bandwidth demand of targeted shaders at the L1-L2 and L2-DRAM interfaces by 23% and 32%, respectively, on the latest generation RTX 3080. These results acutely highlight the sensitivity of cache locality to compute work rasterization order and the importance of locality-aware cooperative thread array scheduling for gaming applications.