2011 International Conference on Parallel Architectures and Compilation Techniques
DOI: 10.1109/pact.2011.21

DeNovo: Rethinking the Memory Hierarchy for Disciplined Parallelism

Cited by 150 publications (150 citation statements)
References 53 publications

“…Several cache-coherence optimizations reduce the cost of updates, though that is not their primary purpose: self-invalidations, done with either hardware predictors [43] or software protocols [16,33], remove invalidations from the critical path; adaptive-granularity coherence schemes [38,67,71] reduce both false sharing and the amount of dirty data sent on invalidations; and speculation and fast networks can reduce the cost of atomic operations [27]. These schemes are orthogonal to Coup, which could be used in conjunction with them to improve performance.…”
Section: Additional Related Work (mentioning)
confidence: 99%
“…SC-for-DRF protocols rely on the guarantee that, during DRF regions, threads perform either private or read-only memory accesses [1], [2], [20]. A memory access is private if it targets a memory location that is only accessed by one thread during the execution of one DRF region; and is read-only if the location is not written within the DRF region.…”
Section: A Sequential Consistency For DRF Protocols (mentioning)
confidence: 99%
“…This excessive invalidation limits their performance [1], [2]. In contrast, SPEL reduces self-invalidation, by relying on the compiler to indicate the points of synchronization that indeed require self-invalidating cached data.…”
Section: A Sequential Consistency For DRF Protocols (mentioning)
confidence: 99%