2012
DOI: 10.1145/2370036.2145849
Revisiting the combining synchronization technique

Abstract: Fine-grain thread synchronization has, in several cases, been shown to be outperformed by efficient implementations of the combining technique, in which a single thread, called the combiner, holds a coarse-grain lock and serves, in addition to its own synchronization request, the active requests announced by other threads while they wait by performing some form of spinning. Efficient implementations of this technique significantly reduce the cost of synchronization, so in many cases the…
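To make the abstract's description concrete, the following C fragment is a minimal sketch of the combining idea applied to a shared counter. It is an illustration only, not the paper's CC-Synch algorithm: the names announce, combiner_lock, NTHREADS, and fetch_and_add are placeholders, and the combiner is elected here with a simple test-and-set lock.

```c
/* Minimal sketch of the combining technique described in the abstract,
 * shown for a shared counter. Illustrative only -- not the paper's
 * CC-Synch algorithm; all names are placeholders. */
#include <stdatomic.h>
#include <stdbool.h>

#define NTHREADS 64

typedef struct {
    atomic_bool pending;   /* request announced but not yet served */
    int arg;               /* operand of the announced operation   */
    int result;            /* filled in by the combiner            */
} request_t;

static request_t   announce[NTHREADS];   /* one announcement slot per thread */
static atomic_bool combiner_lock;        /* coarse-grain lock (false = free) */
static int         shared_counter;       /* protected by combiner_lock       */

int fetch_and_add(int tid, int val)
{
    /* 1. Announce the request in this thread's slot. */
    announce[tid].arg = val;
    atomic_store(&announce[tid].pending, true);

    for (;;) {
        /* 2. Try to become the combiner by taking the coarse-grain lock. */
        if (!atomic_exchange(&combiner_lock, true)) {
            /* Combiner: serve every announced request, ours included. */
            for (int i = 0; i < NTHREADS; i++) {
                if (atomic_load(&announce[i].pending)) {
                    announce[i].result = shared_counter;
                    shared_counter   += announce[i].arg;
                    atomic_store(&announce[i].pending, false);
                }
            }
            atomic_store(&combiner_lock, false);
            return announce[tid].result;
        }
        /* 3. Otherwise spin until some combiner has served our request. */
        if (!atomic_load(&announce[tid].pending))
            return announce[tid].result;
    }
}
```

Threads that lose the race for the lock spin only on their own announcement slot; whichever thread holds the lock executes the whole batch of announced requests in one pass, which is where the reduction in synchronization cost described in the abstract comes from.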

Cited by 71 publications (123 citation statements)
References 19 publications
“…Besides the optimized server-based solutions that implement Algorithm 1, we also evaluate CC-Synch [5], as a representative of combining approaches, as well as H-Synch, its NUMA-aware version. H-Synch follows the general idea of grouping operations originating from the same node and executing them together in batches, thus incurring fewer cross-socket cache line transfers and significantly increasing throughput.…”
Section: Methods
confidence: 99%
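The per-node batching that this citing work attributes to H-Synch can be sketched as follows. This is only an illustration of grouping same-node requests under a global lock, not the H-Synch algorithm itself; NNODES, THREADS_PER_NODE, node_lock, serve_node_batch, and numa_fetch_and_add are assumed names.

```c
/* Sketch of NUMA-aware batching: one combiner per node serves the
 * requests announced on that node while holding a global lock, so
 * same-node operations execute back to back. Illustrative only. */
#include <stdatomic.h>
#include <stdbool.h>

#define NNODES            4
#define THREADS_PER_NODE 16

typedef struct {
    atomic_bool pending;
    int arg, result;
} request_t;

typedef struct {
    request_t   announce[THREADS_PER_NODE]; /* node-local announcements    */
    atomic_bool node_lock;                   /* elects this node's combiner */
} node_t;

/* Statically zero-initialized: all locks free, no requests pending. */
static node_t      nodes[NNODES];
static atomic_bool global_lock;              /* protects shared_counter     */
static int         shared_counter;

/* Serve every request currently announced on one node (the batch). */
static void serve_node_batch(node_t *nd)
{
    for (int i = 0; i < THREADS_PER_NODE; i++) {
        if (atomic_load(&nd->announce[i].pending)) {
            nd->announce[i].result = shared_counter;
            shared_counter        += nd->announce[i].arg;
            atomic_store(&nd->announce[i].pending, false);
        }
    }
}

int numa_fetch_and_add(int node, int slot, int val)
{
    node_t *nd = &nodes[node];

    nd->announce[slot].arg = val;
    atomic_store(&nd->announce[slot].pending, true);

    for (;;) {
        /* Only one thread per node ever competes for the global lock. */
        if (!atomic_exchange(&nd->node_lock, true)) {
            while (atomic_exchange(&global_lock, true))
                ;                         /* spin: wait for other nodes  */
            serve_node_batch(nd);         /* execute this node's batch   */
            atomic_store(&global_lock, false);
            atomic_store(&nd->node_lock, false);
            return nd->announce[slot].result;
        }
        if (!atomic_load(&nd->announce[slot].pending))
            return nd->announce[slot].result;
    }
}
```

Because only one thread per node competes for the global lock and each acquisition serves an entire node-local batch, the shared state crosses the socket boundary once per batch rather than once per operation, which is the reduction in cross-socket cache line transfers described in the quotation.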
“…This turned out to result in unfavorable interference in our experiments, which we avoid by skipping every second cache line when allocating client slots. In experiments where memory management is needed (stacks and queues), cache-aligned memory chunks are allocated and deallocated using per-thread pools (we use the implementation provided by the authors of CC-Synch [5]). …”
Section: Methods
confidence: 99%
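The slot layout mentioned in this quotation, leaving every second cache line unused between client slots, can be sketched as below. CACHE_LINE, MAX_CLIENTS, and client_slot are assumed names, and the 64-byte line size is an assumption, not taken from the cited code.

```c
/* Sketch of client slots spaced two cache lines apart, so that
 * adjacent-cache-line prefetching triggered by one slot does not
 * interfere with its neighbour. Constants and names are illustrative. */
#include <stdalign.h>
#include <stdint.h>

#define CACHE_LINE   64          /* assumed cache line size in bytes */
#define MAX_CLIENTS  128

typedef struct {
    alignas(CACHE_LINE) volatile uint64_t request;   /* announced operation */
    char pad[2 * CACHE_LINE - sizeof(uint64_t)];     /* skip the next line  */
} client_slot_t;                 /* sizeof(client_slot_t) == 2 * CACHE_LINE */

static client_slot_t slots[MAX_CLIENTS];

static inline volatile uint64_t *client_slot(int client_id)
{
    return &slots[client_id].request;
}
```

The per-thread memory pools mentioned in the same sentence address the analogous problem at the allocator level: each thread recycles cache-aligned chunks from its own pool, so allocating and freeing nodes for the stacks and queues never touches another thread's cache lines.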