2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
DOI: 10.1109/micro.2010.51

Thread Cluster Memory Scheduling: Exploiting Differences in Memory Access Behavior

Abstract: In a modern chip-multiprocessor system, memory is a shared resource among multiple concurrently executing threads. The memory scheduling algorithm should resolve memory contention by arbitrating memory access in such a way that competing threads progress at a relatively fast and even pace, resulting in high system throughput and fairness. Previously proposed memory scheduling algorithms are predominantly optimized for only one of these objectives: no scheduling algorithm provides the best system throughput and…
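The paper's title points to its central idea: grouping threads into clusters according to their memory access behavior and scheduling the clusters differently. The sketch below only illustrates that general notion; the clustering rule, the MPKI metric, the 50/50 split, and the priority policy are illustrative assumptions, not the paper's actual algorithm.

```python
# Simplified sketch (not the paper's algorithm): split threads into a
# latency-sensitive cluster (low memory intensity, prioritized) and a
# bandwidth-sensitive cluster (high memory intensity). Threshold and
# policy choices here are assumptions for illustration only.

def cluster_threads(mpki, bandwidth_fraction=0.5):
    """mpki: misses-per-kilo-instruction per thread; returns two clusters."""
    # Sort thread ids by memory intensity, least intensive first.
    order = sorted(range(len(mpki)), key=lambda t: mpki[t])
    cut = int(len(order) * (1.0 - bandwidth_fraction))
    latency_sensitive = order[:cut]     # served with higher priority
    bandwidth_sensitive = order[cut:]   # scheduled to share bandwidth fairly
    return latency_sensitive, bandwidth_sensitive

# Example: four threads with different memory intensities (MPKI).
print(cluster_threads([0.5, 35.0, 2.0, 50.0]))
# ([0, 2], [1, 3]) -> low-MPKI threads prioritized, high-MPKI threads grouped
```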

Cited by 363 publications (390 citation statements)
References 21 publications
“…These designs do not aim to provide worst-case bounds and can underestimate memory interference. Future memory controllers might incorporate ideas like batching and thread prioritization (e.g., [28,19]). This will lead to a different analysis, which could be interesting future work that builds on ours.…”
Section: Related Work
confidence: 99%
“…Each rank consists of multiple banks that share an internal bus for reading/writing data. Because each bank acts as an independent entity, banks can serve multiple memory requests in parallel, offering bank-level parallelism [17,21,32]. A DRAM bank is further sub-divided into multiple subarrays [18,37,44] as shown in Figure 2.…”
Section: DRAM System Organization
confidence: 99%
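The excerpt above describes why banks matter: requests to different banks can proceed concurrently, while requests to the same bank serialize. The minimal sketch below models only that property; the cycle count and the flat per-access service time are assumptions, not real DRAM timing parameters.

```python
# Toy timing model of bank-level parallelism (illustrative assumptions only).

BANK_ACCESS_CYCLES = 50  # assumed per-access service time, not a real DRAM value

def schedule(requests, num_banks=8):
    """requests: list of (arrival_cycle, bank_id); returns completion cycles."""
    bank_free_at = [0] * num_banks      # when each bank next becomes idle
    completions = []
    for arrival, bank in requests:
        start = max(arrival, bank_free_at[bank])  # wait only for *this* bank
        finish = start + BANK_ACCESS_CYCLES
        bank_free_at[bank] = finish
        completions.append(finish)
    return completions

# Two requests to different banks overlap; two to the same bank serialize.
print(schedule([(0, 0), (0, 1)]))  # [50, 50]  -> served in parallel
print(schedule([(0, 0), (0, 0)]))  # [50, 100] -> serialized on one bank
```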
“…All programs are compiled by gcc 4.4.3 with the -O3 optimizations. Similar to previous work, we use weighted speedup [12] (WS) to measure system performance and maximum slowdown (MS) [12] for fairness. We compare several memory allocation schemes including the unmodified paging system in the Linux kernel, utility-based partitioning [15], DRAM bank partitioning [16], random allocation [18] and our proposed HVR system. Figure 11 shows that for workloads that benefit from cache-only and bank-only partitioning (50 workloads in Quadrant I of Figure 3), VP can accumulate the performance gains.…”
Section: Experimental Methodology
confidence: 99%
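The excerpt relies on two metrics that are standard in this literature: weighted speedup for system throughput and maximum slowdown for unfairness. The sketch below uses their common definitions (WS = Σᵢ IPCᵢ_shared / IPCᵢ_alone, MS = maxᵢ IPCᵢ_alone / IPCᵢ_shared); the IPC numbers are made up for illustration.

```python
# Common definitions of the two metrics; example IPC values are illustrative.

def weighted_speedup(ipc_shared, ipc_alone):
    """WS = sum_i IPC_i^shared / IPC_i^alone (higher is better)."""
    return sum(s / a for s, a in zip(ipc_shared, ipc_alone))

def maximum_slowdown(ipc_shared, ipc_alone):
    """MS = max_i IPC_i^alone / IPC_i^shared (lower means fairer)."""
    return max(a / s for s, a in zip(ipc_shared, ipc_alone))

ipc_alone  = [2.0, 1.5, 1.0, 0.8]   # each thread running alone
ipc_shared = [1.2, 1.0, 0.5, 0.6]   # same threads sharing the memory system
print(weighted_speedup(ipc_shared, ipc_alone))   # ~2.52
print(maximum_slowdown(ipc_shared, ipc_alone))   # 2.0 (thread 2 slowed the most)
```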
“…Previous research efforts [12,29] show that contention can significantly degrade the overall system performance and many solutions have been proposed to mitigate the contention problems.…”
Section: Page-coloring Based Memory Management
confidence: 99%