2010
DOI: 10.1145/1785414.1785443

x86-TSO

Abstract: Exploiting the multiprocessors that have recently become ubiquitous requires high-performance and reliable concurrent systems code, for concurrent data structures, operating system kernels, synchronization libraries, compilers, and so on. However, concurrent programming, which is always challenging, is made much more so by two problems. First, real multiprocessors typically do not provide the sequentially consistent memory that is assumed by most work on semantics and verification. Instead, they have relaxed m…

Cited by 345 publications (53 citation statements)
References 23 publications
“…We implemented all of the lock algorithms and benchmarks in C and C++ compiled with GCC 4.7.1 at optimization level -O3 in 32-bit mode. As required, we inserted memory fences to support the memory model on x86 [Sewell et al 2010] and SPARC where a store and load in program order can be reordered by the architecture. While not shown in our pseudo-code, padding and alignment were added in the data structures to avoid false sharing.…”
Section: Empirical Evaluation (mentioning)
confidence: 99%
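
The store-then-load reordering that this statement refers to can be illustrated with the classic store-buffering litmus test. The following is a minimal sketch, not the citing authors' code: the names `SbResult`, `run_store_buffering_once`, and `count_both_zero` are hypothetical, and it assumes a C++11 or later compiler with thread support. Each thread stores to its own flag and then loads the other thread's flag; the full fence between the two accesses is what rules out the outcome in which both threads read 0.

```cpp
#include <atomic>
#include <thread>

// Result of one run of the store-buffering (Dekker-style) litmus test.
struct SbResult {
    int r0;  // value of y observed by thread 0
    int r1;  // value of x observed by thread 1
};

SbResult run_store_buffering_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int r0 = -1;
    int r1 = -1;

    std::thread t0([&] {
        x.store(1, std::memory_order_relaxed);
        // Full fence: the store above is ordered before the load below.
        std::atomic_thread_fence(std::memory_order_seq_cst);
        r0 = y.load(std::memory_order_relaxed);
    });
    std::thread t1([&] {
        y.store(1, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);
        r1 = x.load(std::memory_order_relaxed);
    });
    t0.join();
    t1.join();
    return {r0, r1};
}

// Counts runs in which both threads read 0. With the fences in place the
// C++ memory model forbids that outcome, so the count should stay at 0.
int count_both_zero(int iterations) {
    int both_zero = 0;
    for (int i = 0; i < iterations; ++i) {
        SbResult r = run_store_buffering_once();
        if (r.r0 == 0 && r.r1 == 0) ++both_zero;
    }
    return both_zero;
}
```

Calling `count_both_zero(1000)` should return 0. If the two fences are removed, the outcome `r0 == 0 && r1 == 0` becomes permitted, and on x86 or SPARC TSO hardware it may actually be observed, although a test that starts fresh threads on every iteration rarely hits the required timing. The sketch omits the padding and alignment against false sharing that the quoted statement mentions.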
“…The total-store-ordering semantics [20] provides that all instructions are executed in program order (or, more precisely, cannot be observed to be executed out of program order), each write is visible either globally or only to its own thread, and writes become globally visible in program order. Consequently, visibility- and execution-order edges compile away to nothing, and pushes can be implemented by mfences.…”
Section: Methods (mentioning)
confidence: 99%
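
The properties listed in this statement (a write is visible either only to its own thread or globally, and writes become globally visible in program order) can be sketched with a toy store-buffer machine. This is a simplified illustration in the spirit of a TSO abstract machine, not the formal model from the paper: the class `TsoMachine`, its methods, and `store_buffering` are hypothetical names, and locked instructions are left out entirely.

```cpp
#include <cstddef>
#include <deque>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy TSO-style machine: shared memory plus one FIFO store buffer per thread.
class TsoMachine {
public:
    explicit TsoMachine(std::size_t threads) : buffers_(threads) {}

    // A write enters the writer's own buffer; only that thread can see it.
    void write(std::size_t tid, const std::string& addr, int value) {
        buffers_[tid].emplace_back(addr, value);
    }

    // A read returns this thread's newest buffered write to addr,
    // otherwise the value in shared memory.
    int read(std::size_t tid, const std::string& addr) const {
        const auto& buf = buffers_[tid];
        for (auto it = buf.rbegin(); it != buf.rend(); ++it) {
            if (it->first == addr) return it->second;
        }
        auto found = memory_.find(addr);
        return found == memory_.end() ? 0 : found->second;
    }

    // Make the oldest buffered write globally visible (FIFO = program order).
    void flush_one(std::size_t tid) {
        auto& buf = buffers_[tid];
        if (buf.empty()) return;
        memory_[buf.front().first] = buf.front().second;
        buf.pop_front();
    }

    // Model of a full fence: drain this thread's buffer before continuing.
    void mfence(std::size_t tid) {
        while (!buffers_[tid].empty()) flush_one(tid);
    }

private:
    std::map<std::string, int> memory_;  // unwritten addresses read as 0
    std::vector<std::deque<std::pair<std::string, int>>> buffers_;
};

// Store-buffering test under one fixed interleaving:
// both stores are issued first, then both loads.
std::pair<int, int> store_buffering(bool with_fence) {
    TsoMachine m(2);
    m.write(0, "x", 1);
    if (with_fence) m.mfence(0);
    m.write(1, "y", 1);
    if (with_fence) m.mfence(1);
    int r0 = m.read(0, "y");
    int r1 = m.read(1, "x");
    return {r0, r1};
}
```

In this model `store_buffering(false)` yields `(0, 0)`, because each store is still sitting in its own thread's buffer when the other thread reads. `store_buffering(true)` yields `(1, 1)`, because each fence drains the buffer to shared memory before the loads run.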
“…We take as our reference points the (broadly similar) Power and ARM architectures, and the x86 architecture, because they enjoy rigorous, usable specifications [19,5,20]. We focus on the former because in all cases relevant to this paper, the complexities of Power and ARM subsume those of x86.…”
Section: The RMC Memory Model (mentioning)
confidence: 99%