Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming 2013
DOI: 10.1145/2442516.2442532

NUMA-aware reader-writer locks

Abstract: Non-Uniform Memory Access (NUMA) architectures are gaining importance in mainstream computing systems due to the rapid growth of multi-core multi-chip machines. Extracting the best possible performance from these new machines will require us to revisit the design of the concurrent algorithms and synchronization primitives which form the building blocks of many of today's applications. This paper revisits one such critical synchronization primitive: the reader-writer lock. We present what is, to the best of our …

Cited by 67 publications (31 citation statements)
References 22 publications
“…Even our optimized NO WAIT implementation does not scale as well as SILO due to contention caused by atomic instructions used in the read-write lock implementation (Figure 2). Designing scalable, NUMA-aware read-write lock is a topic of intense research in the concurrent programming community [6,10,16]. Using such locks to further minimize the impact of physical synchronization in both pessimistic and optimistic protocols is a promising direction of future research.…”
Section: Implications Of Our Analysis (mentioning)
confidence: 99%
“…For example, our pipeline runs 8 instances (two per socket) of the CNN code for membrane detection, where each instance uses 9 cores. This enabled efficient use of the caches on each socket and eliminated the need to handle complex NUMA overheads [5,13,14,16,47].…”
Section: Scalable Software Saves Memory (mentioning)
confidence: 99%
“…Scalable synchronization structures typically rely on efficient inter-core communication using atomic operations. Since an atomic operation becomes much slower over inter-socket links, proposals for scalable NUMA-aware locks rely on hierarchically partitioned structures to maximize access locality [9][10]. On the system level, a recent study on the performance of garbage collectors on multisocket multicores analyzes synchronization patterns and systematically removes bottlenecks without completely redesigning the system [11].…”
Section: A Multisocket Multicores (mentioning)
confidence: 99%