The advent of multi-/many-core architectures demands efficient run-time support to sustain the scalability of parallel applications. Synchronization mechanisms should be optimized to account for different scenarios, such as the interaction between threads executed on different cores as well as intra-core synchronization, i.e., between threads executed on hardware contexts of the same core. From this perspective, we describe the design issues of two notable mechanisms for shared-memory parallel computations. We point out how specific architectural supports, such as hardware cache coherence and core-to-core interconnection networks, make it possible to design optimized implementations of such mechanisms. In this paper we discuss experimental results on three representative architectures: a flagship Intel multi-core and two interesting network processors. The final result helps to untangle the complex implementation space of synchronization mechanisms.

KEY WORDS
Synchronization, Locking, Simultaneous Multi-Threading, Busy-Waiting, Multi-cores, Network Processors