Locality-Aware CTA Scheduling for Gaming Applications

Ukarande, Aditya; Patidar, Suryakant; Rangan, Ram

doi:10.1145/3477497

Cited by 2 publications

(1 citation statement)

References 35 publications

“…Another work [8] proposes a NUCA organization for the L1 texture caches to increase their effective overall capacity. Focusing on texture locality-aware workload scheduling to different shader cores with software modifications, Ukarande et al [48] report a 4% speedup when exploiting Texture Cache locality on high-end desktop graphics workloads. Another work [21] also exploits Texture Cache locality by scheduling quads that are closer in screen coordinates.…”

Section: Related Work (mentioning)

confidence: 99%

Boustrophedonic Frames: Quasi-Optimal L2 Caching for Textures in GPUs

Joseph, Aragón, Parcerisa et al., 2023

2023 32nd International Conference on Parallel Architectures and Compilation Techniques (PACT)

Literature is plentiful in works exploiting cache locality for GPUs. A majority of them explore replacement or bypassing policies. In this paper, however, we surpass this exploration by fabricating a formal proof for a no-overhead quasi-optimal caching technique for caching textures in graphics workloads. Textures make up a significant part of main memory traffic in mobile GPUs, which contributes to the total GPU energy consumption. Since texture accesses use a shared L2 cache, improving the L2 texture caching efficiency would decrease main memory traffic, thus improving energy efficiency, which is crucial for mobile GPUs. Our proposal reaches quasi-optimality by exploiting the frame-to-frame reuse of textures in graphics. We do this by traversing frames in a boustrophedonic manner w.r.t. the frame-to-frame tile order. We first approximate the texture access trace to a circular trace and then forge a formal proof for our proposal being optimal for such traces. We also complement the proof with empirical data that demonstrates the quasi-optimality of our no-cost proposal.
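The intuition behind alternating the traversal direction can be illustrated with a toy simulation. The sketch below is not the authors' implementation: it assumes an LRU-managed cache, assumes each tile touches exactly one distinct texture block, and interprets "boustrophedonic" as reversing the tile order on every other frame. The names `lru_hits` and `frame_trace` and all parameter values are hypothetical.

```python
from collections import OrderedDict


def lru_hits(trace, capacity):
    """Count hits for an access trace on an LRU cache of the given capacity."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)  # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = True
    return hits


def frame_trace(num_blocks, num_frames, boustrophedonic):
    """Build a toy trace: one texture block per tile, same tiles every frame.

    Assumption: with boustrophedonic=True the tile order is reversed on
    odd-numbered frames; otherwise every frame uses the same order,
    which gives a circular trace.
    """
    trace = []
    for frame in range(num_frames):
        order = list(range(num_blocks))
        if boustrophedonic and frame % 2 == 1:
            order.reverse()
        trace.extend(order)
    return trace


# Toy parameters (assumed): 8 blocks per frame, 4 frames, cache holds 4 blocks.
circular_hits = lru_hits(frame_trace(8, 4, False), 4)
zigzag_hits = lru_hits(frame_trace(8, 4, True), 4)
print(circular_hits, zigzag_hits)
```

With these toy parameters the circular trace gets no hits, because each block is evicted just before it is reused, while the alternating order hits on the 4 most recently cached blocks at the start of every frame after the first, for 12 hits in total.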
