PR-STM: Priority Rule Based Software Transactions for the GPU

Shen, Qi; Sharp, Craig; Blewitt, William; Ushaw, Gary; Morgan, Graham

doi:10.1007/978-3-662-48096-0_28

Cited by 8 publications

(3 citation statements)

References 18 publications

(25 reference statements)

“…A second relevant observation is that architectural differences of CPUs and (discrete) GPUs have a great impact on their programming models and, as such, HeTM systems should keep these aspects into account to attain high efficiency. One key issue is that, differently from CPUs, where transactions are typically executed individually, in GPUs it is desirable to execute transactions in relatively large batches [50], [7], as this allows for: i) amortizing the latency of transactions' activation; ii) enhancing throughput when transferring to/from the GPU the inputs/output required/produced by transactions' execution; iii) improving resource utilization on modern GPUs.…”

Section: A Architecture and Programming Model (mentioning)

confidence: 99%

“…Supported libraries. Currently, SHeTM supports three TM implementations: two on the CPU side - TinySTM [15] and Intel's TSX [29], implemented respectively in software and hardware - and one on the GPU side, namely PR-STM [50].…”

Section: B Integration With Guest TM Libraries (mentioning)

confidence: 99%

“…As mentioned, the literature in the area of TM has elaborated a plethora of design, exploring both hardware and software implementations. The majority of the existing literature has focused on investigating TM implementations for CPUs, although a number of TM systems for GPUs [18], [27], [57], [50] have been explored of late. In this area, a relevant related work is the recent APUTM [54], which addressed the problem of implementing a STM for integrated GPUs.…”

Section: Related Work (mentioning)

confidence: 99%

HeTM: Transactional Memory for Heterogeneous Systems

Castro, Romano, Ilić, et al. 2019

2019 28th International Conference on Parallel Architectures and Compilation Techniques (PACT)

Modern heterogeneous computing architectures, which couple multi-core CPUs with discrete many-core GPUs (or other specialized hardware accelerators), enable unprecedented peak performance and energy efficiency levels. Unfortunately, though, developing applications that can take full advantage of the potential of heterogeneous systems is a notoriously hard task. This work takes a step towards reducing the complexity of programming heterogeneous systems by introducing the abstraction of Heterogeneous Transactional Memory (HeTM). HeTM provides programmers with the illusion of a single memory region, shared among the CPUs and the (discrete) GPU(s) of a heterogeneous system, with support for atomic transactions. Besides introducing the abstract semantics and programming model of HeTM, we present the design and evaluation of a concrete implementation of the proposed abstraction, which we named Speculative HeTM (SHeTM). SHeTM makes use of a novel design that leverages on speculative techniques and aims at hiding the inherently large communication latency between CPUs and discrete GPUs and at minimizing inter-device synchronization overhead. SHeTM is based on a modular and extensible design that allows for easily integrating alternative TM implementations on the CPU's and GPU's sides, which allows the flexibility to adopt, on either side, the TM implementation (e.g., in hardware or software) that best fits the applications' workload and the architectural characteristics of the processing unit. We demonstrate the efficiency of the SHeTM via an extensive quantitative study based both on synthetic benchmarks and on a porting of a popular object caching system.
