Proceedings of the 38th Annual International Symposium on Computer Architecture 2011
DOI: 10.1145/2000064.2000093

Energy-efficient mechanisms for managing thread context in throughput processors

Abstract: Modern graphics processing units (GPUs) use a large number of hardware threads to hide both function unit and memory access latency. Extreme multithreading requires a complicated thread scheduler as well as a large register file, which is expensive to access both in terms of energy and latency. We present two complementary techniques for reducing energy on massively-threaded processors such as GPUs. First, we examine register file caching to replace accesses to the large main register file with accesses to a s…

Cited by 214 publications (105 citation statements)
References 30 publications
“…Several researchers have proposed a variety of schedulers that preferentially schedule out of a small pool of warps [15], [16]. These two-level schedulers have been developed for a number of reasons, but all of them generally have the effect of reducing contention in the caches and memory subsystem by limiting the number of co-scheduled warps.…”
Section: Related Work (mentioning)
confidence: 99%
“…Later, Gebhart et al [26] used a two-level warp scheduling technique so as to reduce the consumption of energy. The researchers noticed that the written registers are often read last within three instructions after they are written.…”
Section: Mechanism(tl W) (mentioning)
confidence: 99%
“…We have already provided quantitative comparisons of our proposal with the two-level scheduler. Gebhart and Johnson et al [12] propose a two-level warp scheduling technique that aims to reduce energy consumption in GPUs. Jog et al [19] propose OWL, a series of CTA-aware warp scheduling techniques to reduce cache contention and improve DRAM performance for bandwidth-limited GPGPU applications.…”
Section: Related Work (mentioning)
confidence: 99%