Wenjia Ruan scite author profile

Spear

2015

Hybrid Transactional Memory (TM) uses available hardware TM resources to execute language-level transactions, and falls back to a software TM implementation for those transactions that cannot complete in hardware. Ideally, a hybrid TM would allow hardware and software transactions to run concurrently, but would not waste hardware TM resources on coordination between the two classes of transactions. In addition, it should scale well, incur little latency, offer strong safety guarantees, and provide some degree of fairness. We introduce a new hybrid TM algorithm, "Hybrid Cohorts", in which hardware transactions do not modify global metadata, and software transactions have extremely low per-access overhead. The tradeoff is that hardware transactions cannot commit while software transactions are in flight. Evaluation on an 8-thread Intel Haswell CPU shows competitive performance with the current state-of-the-art. Furthermore, it does so while providing acceptable levels of fairness and safety, and offering opportunities for hardware acceleration.

2 The Hybrid Cohorts Approach

HyTM algorithms that descend from NOrec share two key properties: First, the opacity of each software transaction (STx) is preserved by requiring every hardware transac

Boosting timestamp-based transactional memory by exploiting hardware cycle counters

ACM Trans. Archit. Code Optim.

Liu

Spear

2013

Time-based transactional memories typically rely on a shared memory counter to ensure consistency. Unfortunately, such a counter can become a bottleneck. In this article, we identify properties of hardware cycle counters that allow their use in place of a shared memory counter. We then devise algorithms that exploit the x86 cycle counter to enable bottleneck-free transactional memory runtime systems. We also consider the impact of privatization safety and hardware ordering constraints on the correctness, performance, and generality of our algorithms.

Transactional Read-Modify-Write Without Aborts

ACM Trans. Archit. Code Optim.

Liu

Spear

2015

Language-level transactions are said to provide "atomicity," implying that the order of operations within a transaction should be invisible to concurrent transactions and thus that independent operations within a transaction should be safe to execute in any order. In this article, we present a mechanism for dynamically reordering memory operations within a transaction so that read-modify-write operations on highly contended locations can be delayed until the very end of the transaction. When integrated with traditional transactional conflict detection mechanisms, our approach reduces aborts on hot memory locations, such as statistics counters, thereby improving throughput and reducing wasted work. We present three algorithms for delaying highly contended read-modify-write operations within transactions, and we evaluate their impact on throughput for eager and lazy transactional systems across multiple workloads. We also discuss complications that arise from the interaction between our mechanism and the need for strong language-level semantics, and we propose algorithmic extensions that prevent errors from occurring when accesses are aggressively reordered in a transactional memory implementation with weak semantics.

Boosting timestamp-based transactional memory by exploiting hardware cycle counters

Ruan

Liu

Spear

2013

TACO

Value prediction for security (VPsec): Countering fault attacks in modern microprocessors

Sheikh

Cammarota

2018