Josep Torrellas scite author profile

Much emphasis is now placed on chip-multiprocessor (CMP) architectures for exploiting thread-level parallelism in an application. In such architectures, speculation may be employed to execute applications that cannot be parallelized statically. In this paper, we present an efficient CMP architecture for speculative execution of sequential binaries without source recompilation. We present the software support that enables identification of threads from a sequential binary. The hardware includes a memory disambiguation mechanism that enables the detection of interthread memory dependence violations during speculative execution. This hardware is different from past proposals in that it does not rely on a snoopy-based cache-coherence protocol. Instead, it uses an approach similar to a directory-based scheme. Furthermore, the architecture includes a simple and efficient hardware mechanism to enable register-level communication between on-chip processors. Evaluation of this software-hardware approach shows that it is quite effective in achieving high performance when running sequential binaries.

Bulk Disambiguation of Speculative Threads in Multiprocessors

Ceze, Tuck, Torrellas et al. 2006

SIGARCH Comput. Archit. News

137

178

Transactional Memory (TM), Thread-Level Speculation (TLS), and Checkpointed multiprocessors are three popular architectural techniques based on the execution of multiple, cooperating speculative threads. In these environments, correctly maintaining data dependences across threads requires mechanisms for disambiguating addresses across threads, invalidating stale cache state, and making committed state visible. These mechanisms are both conceptually involved and hard to implement. In this paper, we present Bulk, a novel approach to simplify these mechanisms. The idea is to hash-encode a thread's access information in a concise signature, and then support in hardware signature operations that efficiently process sets of addresses. Such operations implement the mechanisms described. Bulk operations are inexact but correct, and provide substantial conceptual and implementation simplicity. We evaluate Bulk in the context of TLS using SPECint2000 codes and TM using multithreaded Java workloads. Despite its simplicity, Bulk has competitive performance with more complex schemes. We also find that signature configuration is a key design parameter.
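The signature idea in this abstract can be illustrated with a minimal sketch. It assumes a Bloom-filter-style encoding split into a few bit fields; the `Signature` class, the multiplier constants, and the field width are hypothetical choices made for illustration and are not the configuration evaluated in the paper. The sketch shows why such operations are "inexact but correct": an intersection test may report a conflict that does not exist, but it never misses a real one.

```python
class Signature:
    """Bloom-filter-style summary of a set of addresses (illustrative only)."""

    # Hypothetical odd multipliers, one per signature field.
    _MULTIPLIERS = (0x9E3779B1, 0x85EBCA6B, 0xC2B2AE35)

    def __init__(self, field_bits=32):
        self.field_bits = field_bits
        self.fields = [0] * len(self._MULTIPLIERS)

    def _bit(self, addr, i):
        # Map an address to one bit position inside field i.
        return ((addr * self._MULTIPLIERS[i]) >> 5) % self.field_bits

    def insert(self, addr):
        # Record an access by setting one bit in every field.
        for i in range(len(self.fields)):
            self.fields[i] |= 1 << self._bit(addr, i)

    def may_contain(self, addr):
        # True for every inserted address; may also be True for others.
        return all((f >> self._bit(addr, i)) & 1
                   for i, f in enumerate(self.fields))

    def intersects(self, other):
        # The encoded sets are provably disjoint if any field shares no bit.
        return all(a & b for a, b in zip(self.fields, other.fields))


# Addresses written by a committing thread and read by a speculative one.
write_sig, read_sig = Signature(), Signature()
for addr in (0x1000, 0x1040):
    write_sig.insert(addr)
for addr in (0x2000, 0x1040):
    read_sig.insert(addr)

# Both sets include 0x1040, so the conflict is always detected.
conflict = read_sig.intersects(write_sig)
```

Because a shared address sets the same bit in every field of both signatures, a true dependence always yields a non-empty intersection. Aliasing can only add spurious conflicts, which would cost performance rather than correctness.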

ReVive: cost-effective architectural support for rollback recovery in shared-memory multiprocessors

Speculative Taint Tracking (STT): A Comprehensive Protection for Speculatively Accessed Data

et al. 2020

Facelift: Hiding and slowing down aging in multicores

2008

Positional adaptation of processors: application to energy reduction

Huang, Renau, Torrellas

120

Although adaptive processors can exploit application variability to improve performance or save energy, effectively managing their adaptivity is challenging. To address this problem, we introduce a new approach to adaptivity: the Positional approach. In this approach, both the testing of configurations and the application of the chosen configurations are associated with particular code sections. This is in contrast to the currently used Temporal approach to adaptation, where both the testing and application of configurations are tied to successive intervals in time. We propose to use subroutines as the granularity of code sections in positional adaptation. Moreover, we design three implementations of subroutine-based positional adaptation that target energy reduction in three different workload environments: embedded or specialized server, general purpose, and highly dynamic. All three implementations of positional adaptation are much more effective than temporal schemes. On average, they boost the energy savings of applications by 50% and 84% over temporal schemes in two experiments.
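The contrast with temporal adaptation can be sketched in a few lines. This is a simplified illustration under stated assumptions, not any of the three implementations the abstract mentions: the `PositionalAdapter` class, the configuration names, and the energy numbers are all hypothetical. The point it shows is that both the trial of each configuration and the final choice are keyed by subroutine rather than by time interval.

```python
class PositionalAdapter:
    """Tests and applies hardware configurations per subroutine (illustrative)."""

    def __init__(self, configs):
        self.configs = list(configs)
        self.trials = {}   # subroutine -> {config: measured energy}
        self.chosen = {}   # subroutine -> lowest-energy config found

    def config_for(self, subroutine):
        # Once a subroutine has been fully tested, always apply its best config.
        if subroutine in self.chosen:
            return self.chosen[subroutine]
        # Otherwise keep testing: return the next untried configuration.
        tried = self.trials.setdefault(subroutine, {})
        for config in self.configs:
            if config not in tried:
                return config
        return self.configs[0]

    def report(self, subroutine, config, energy):
        # Record one measurement; pick a winner once every config was tried.
        tried = self.trials.setdefault(subroutine, {})
        tried[config] = energy
        if len(tried) == len(self.configs):
            self.chosen[subroutine] = min(tried, key=tried.get)


# Made-up energy costs per (subroutine, configuration) pair.
energy = {
    "fft":   {"small_cache": 5.0, "large_cache": 3.0},
    "parse": {"small_cache": 1.0, "large_cache": 2.0},
}

adapter = PositionalAdapter(["small_cache", "large_cache"])
for _ in range(2):                      # each subroutine is invoked twice
    for sub in ("fft", "parse"):
        cfg = adapter.config_for(sub)   # decided on subroutine entry
        adapter.report(sub, cfg, energy[sub][cfg])
```

After two invocations of each subroutine, `adapter.chosen` maps `fft` to `large_cache` and `parse` to `small_cache`. A temporal scheme would instead pick one configuration per time interval, regardless of which code happened to run in it.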

Josep Torrellas

VARIUS: A Model of Process Variation and Resulting Timing Errors for Microarchitects

InvisiSpec: Making Speculative Execution Invisible in the Cache Hierarchy

A chip-multiprocessor architecture with speculative multithreading
