2006
DOI: 10.1109/micro.2006.31

Managing Distributed, Shared L2 Caches through OS-Level Page Allocation

Cited by 234 publications (205 citation statements)
References 25 publications
“…With page-coloring, one can mitigate the contention problem [4,10,15,17,21,22,24,26] by modifying the kernel buddy system while avoiding expensive hardware changes to memory controllers or cache hierarchies.…”
Section: Page-coloring Based Memory Management (mentioning)
confidence: 99%
“…A third configuration uses private LLCs. Finally, we consider an S-NUCA configuration in which the blocks are mapped to the L2 banks using a first touch policy [29]. The first time a block is requested, the memory page containing that block is mapped to the L2 bank in the requestor's tile.…”
Section: Performance Evaluation (mentioning)
confidence: 99%
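The first-touch policy described in the statement above can be illustrated with a minimal sketch. It is not the cited paper's implementation: the class name `FirstTouchMapper`, the method `bank_for`, and the 4 KiB `PAGE_SIZE` are hypothetical choices made only for illustration, and tiles are assumed to be identified by small integers.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (illustrative, not from the source)


class FirstTouchMapper:
    """Hypothetical sketch: pin each memory page to the L2 bank of the
    tile that first requests any block in that page."""

    def __init__(self):
        # page number -> tile/bank id chosen at first touch
        self.page_to_bank = {}

    def bank_for(self, address, requester_tile):
        page = address // PAGE_SIZE
        # setdefault stores the requester's tile only if the page is unmapped;
        # later requests from any tile get the bank recorded at first touch.
        return self.page_to_bank.setdefault(page, requester_tile)


mapper = FirstTouchMapper()
first = mapper.bank_for(0x1000, requester_tile=3)   # first touch of page 1
later = mapper.bank_for(0x1FFF, requester_tile=5)   # same page, other tile
other = mapper.bank_for(0x2000, requester_tile=5)   # first touch of page 2
```

In this sketch `first` and `later` both resolve to tile 3, because the page was already mapped when tile 5 requested it, while `other` resolves to tile 5.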
“…OS-based techniques to achieve a better mapping of the cache blocks to the LLC banks have been proposed by Cho et al [29], Ros et al [20], Das et al…”
Section: Related Work (mentioning)
confidence: 99%
“…All results are normalized to that of an ideal interconnect, in which we do not model any routing delay, contention, or queuing delays. We model only the wire delay over the Manhattan distance between the sender and receiver node (30ps/mm [32] … [4], [15], [24] to reduce traffic …

Benchmarks Used
Splash-2 [35]: barnes (ba), cholesky (ch), fft (ff), fmm (fm), lu (lu), ocean (oc), radiosity (rs), radix (rx), raytrace (ry), water-spatial (ws)
Parsec [7]: blackscholes (bl), fluidanimate (fl)
Other: em3d (em), ilink (il), jacobi (ja), mp3d (mp), shallow (sh), tsp (ts)

… The reason for TLLB's performance is its latency. In a medium-scale CMP like the one simulated here, the overall throughput demand seldom overwhelms the shared bus.…”
Section: B Experimental Analysis (mentioning)
confidence: 99%