Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2020
DOI: 10.1145/3373376.3378468

Mitosis: Transparently Self-Replicating Page-Tables for Large-Memory Machines

Abstract: Multi-socket machines with 1-100 TBs of physical memory are becoming prevalent. Applications running on multi-socket machines suffer non-uniform bandwidth and latency when accessing physical memory. Decades of research have focused on data allocation and placement policies in NUMA settings, but there have been no studies on the question of how to place page-tables amongst sockets. We make the case for explicit page-table allocation policies and show that page-table placement is becoming crucial to overall performance. […]
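
The abstract describes transparently replicating page-tables across sockets so that hardware page walks stay NUMA-local. As a minimal illustrative sketch of that idea, assuming hypothetical structures and function names (this is not the Mitosis implementation), each page-table page could be mirrored into per-socket replicas, with every update broadcast to all copies:

```c
/* Illustrative sketch only: not the Mitosis implementation. All names here
 * (pte_replica_set, write_pte_replicated, ...) are hypothetical. The idea
 * from the abstract: keep one page-table copy per socket and mirror every
 * update, so a page walk on any socket reads local memory instead of a
 * remote NUMA node. */
#include <stdint.h>
#include <stddef.h>

#define MAX_SOCKETS 8

/* One page-table page (512 8-byte entries on x86-64), replicated per socket. */
struct pte_replica_set {
    uint64_t *replica[MAX_SOCKETS];  /* each allocated from that socket's local memory */
    int       nr_sockets;
};

/* Mirror a PTE update into every per-socket replica so all copies stay
 * byte-identical and any socket can walk its own local copy. */
static void write_pte_replicated(struct pte_replica_set *set,
                                 size_t index, uint64_t pte_val)
{
    for (int s = 0; s < set->nr_sockets; s++)
        set->replica[s][index] = pte_val;
}

/* On a TLB miss, the walker on socket s would be pointed at set->replica[s],
 * keeping the walk's memory accesses NUMA-local without application changes. */
```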

Cited by 39 publications (31 citation statements)
References 51 publications
“…There has been a tremendous amount of work aimed at improving translation range and efficiency (and thereby reducing the number of page walks) [8,16,21,23,25,27,33,37,38,42-46,53]. Other works have focused on reducing the TLB miss penalty by improving the page table walk caches [14,17,18], using speculation to hide latency [5,8,15,47], optimizing hash page tables [52], and replicating page tables across NUMA nodes [3]. For virtualized systems, Gandhi et al. proposed merging the 2D page table into a single dimension where possible [26].…”
Section: Related Work
confidence: 99%
“…Biasing the replacement policy to favor page table entries means evicting more data, but we find that applications with high TLB miss rates also exhibit high data miss rates (L2 and L3 data miss ratios of 95% and 80%). This, combined with the page table access being on the critical path to the data access, suggests that allocating more cache space to the (much smaller) page table over the data itself is likely to be more beneficial than caching the data.…”
Section: Cache Prioritization
confidence: 99%
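
The statement above argues for biasing cache replacement toward keeping page-table entries resident, since the page-table access sits on the critical path of the data access. A minimal simulator-style sketch of that idea, using hypothetical structures rather than the citing paper's actual mechanism, is:

```c
/* Hedged sketch of the cited idea: when choosing an eviction victim in a
 * set-associative cache, prefer to evict data lines over lines holding
 * page-table entries. All structures are hypothetical simulator state,
 * not a real hardware interface. */
#include <stdbool.h>

#define WAYS 16

struct cache_line {
    bool     valid;
    bool     is_page_table;  /* tagged when the line was filled by a page-walk request */
    unsigned lru_age;        /* higher = older */
};

/* Pick a victim way: the oldest data line if any data line exists,
 * falling back to the oldest page-table line only when the set is
 * entirely page-table entries. */
static int pick_victim(struct cache_line set[WAYS])
{
    int victim = -1, victim_pt = -1;
    for (int w = 0; w < WAYS; w++) {
        if (!set[w].valid)
            return w;                          /* free way: no eviction needed */
        if (!set[w].is_page_table) {
            if (victim < 0 || set[w].lru_age > set[victim].lru_age)
                victim = w;
        } else if (victim_pt < 0 || set[w].lru_age > set[victim_pt].lru_age) {
            victim_pt = w;
        }
    }
    return victim >= 0 ? victim : victim_pt;
}
```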
“…Migration and replication of data pages and page-tables are commonly used to ameliorate the performance impact of NUMA effects [2,23,87,94], but policies depend critically on access frequency metadata. When a single access bit is read periodically to determine the hotness of an entire 2MB region, pages can easily appear artificially hot.…”
Section: Motivation
confidence: 99%
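
The concern above is about sampling granularity: one accessed bit covering a 2MB region cannot distinguish a single touched 4KB page from 512 of them. A hedged sketch of such a periodic access-bit scan, with hypothetical names, shows why the whole region gets credited as hot:

```c
/* Hedged sketch of the sampling problem the citing paper describes: if
 * hotness is inferred from the single accessed bit of a 2MB huge-page
 * mapping, one touch anywhere in its 512 constituent 4KB pages marks the
 * whole region hot. Names (region_meta, scan_region) are hypothetical. */
#include <stdint.h>

struct region_meta {
    uint64_t hot_score;   /* incremented each scan interval the bit was found set */
};

/* Periodic scan: test-and-clear the accessed bit of a 2MB mapping.
 * A single 4KB access during the interval sets the bit, so the entire
 * 2MB region is credited as hot even if 511 of its pages were cold.
 * (Non-atomic clear; a real kernel would use an atomic bit operation.) */
static void scan_region(volatile uint64_t *pmd_entry, struct region_meta *m)
{
    const uint64_t ACCESSED = 1ull << 5;   /* x86-64 accessed (A) bit */
    if (*pmd_entry & ACCESSED) {
        m->hot_score++;
        *pmd_entry &= ~ACCESSED;           /* clear so the next interval re-samples */
    }
}
```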
“…We periodically pause and capture memory metadata at points in the execution that are the same in both execution cases. For GPUs, capturing metadata and establishing correspondence is considerably simpler, as the prototype runs in a simulator: a single trace of memory references and instruction counts is captured and post-processed to produce a set of epochs and snapshots of per-page metadata.…”
Section: Metadata Fidelity
confidence: 99%