Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming 2012
DOI: 10.1145/2145816.2145841

Deterministic parallel random-number generation for dynamic-multithreading platforms

Abstract: Existing concurrency platforms for dynamic multithreading do not provide repeatable parallel random-number generators. This paper proposes that a mechanism called pedigrees be built into the runtime system to enable efficient deterministic parallel random-number generation. Experiments with the open-source MIT Cilk runtime system show that the overhead for maintaining pedigrees is negligible. Specifically, on a suite of 10 benchmarks, the relative overhead of Cilk with pedigrees to the original Cilk has a geome…
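The idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' runtime implementation: the task structure, the hash function (SHA-256), and the seed below are assumptions chosen for illustration. The point it demonstrates is that if every task carries a pedigree (the sequence of spawn ranks on the path from the root), then hashing that pedigree with a fixed seed produces random numbers that do not depend on how the scheduler interleaves the tasks.

```python
# Sketch of pedigree-based deterministic parallel RNG (illustrative only).
# Each task's pedigree is the tuple of spawn ranks from the root task;
# hashing (seed, pedigree) gives a value independent of scheduling order.

import hashlib
from concurrent.futures import ThreadPoolExecutor

SEED = 0x12345678  # hypothetical fixed seed, treated as part of the input


def pedigree_rand(pedigree):
    """Map a pedigree (tuple of spawn ranks) to a deterministic 64-bit value."""
    data = SEED.to_bytes(8, "little") + b"".join(
        r.to_bytes(8, "little") for r in pedigree)
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "little")


def work(pedigree, depth):
    """Toy divide-and-conquer computation that consumes random numbers."""
    if depth == 0:
        return pedigree_rand(pedigree)
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Child i extends the parent's pedigree with its spawn rank i.
        futures = [pool.submit(work, pedigree + (i,), depth - 1)
                   for i in range(2)]
        return sum(f.result() for f in futures)


if __name__ == "__main__":
    # The printed total is identical across runs and thread schedules,
    # because each leaf's value depends only on its pedigree and the seed.
    print(work((), 4))
```

Because every leaf's value is a pure function of its pedigree and the seed, rerunning the program with a different worker count or interleaving yields the same result, which is the repeatability property the paper targets.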

Cited by 37 publications (30 citation statements) · References 31 publications
“…The NUMA extension supports non-commuting reductions [6] and pedigrees [16]. Both constructs depend on the execution order of function calls, which the helper function disrupts.…”
Section: F. A NUMA-Aware Cilk Extension
confidence: 99%
“…Figure 1 shows the performance loss of the reproducible implementation of transcendental functions with respect to the standard library installed on different systems, i.e., Glibc for the CPU and the CUDA toolkit for GPUs. The loss of performance is defined as the ratio of the time required by the deterministic implementation to the time required by the standard one on the corresponding platform to perform the task of evaluating the function on 2^22 input values. Figure 2 shows the geometric mean of the performance loss of all implemented functions for every architecture.…”
Section: Case Study: Standard Transcendental Functions
confidence: 99%
“…Other studies on non-determinism caused by parallelism have been performed by Bergan et al [20], Bocchino et al [21], Leiserson et al [22], Olszewski et al [23].…”
Section: Related Work
confidence: 99%
“…Not only must the output of the program be deterministic, but all intermediate values returned from operations must also be deterministic. We note that this does not preclude the use of pseudorandom numbers, where one can use, for example, the approach of Leiserson et al [33] to generate deterministic pseudorandom numbers in parallel from a single seed, which can be part of the input.…”
Section: Programming Model
confidence: 99%