Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays 2009
DOI: 10.1145/1508128.1508139

A comparison of CPUs, GPUs, FPGAs, and massively parallel processor arrays for random number generation

Abstract: The future of high-performance computing is likely to rely on the ability to efficiently exploit huge amounts of parallelism. One way of taking advantage of this parallelism is to formulate problems as "embarrassingly parallel" Monte Carlo simulations, which allow applications to achieve a linear speedup over multiple computational nodes, without requiring a super-linear increase in inter-node communication. However, such applications are reliant on a cheap supply of high quality random numbers, particularly fo…

Citations: cited by 141 publications (85 citation statements)
References: 23 publications (34 reference statements)
“…In the source code, flux is represented as an array with five elements, and each element can be accessed by using its number as an element, e.g. flux_3. Like flux, flux_sum is a two-dimensional array for which the index indicates a vector element and a grid number.…”
Section: Target Subroutines (mentioning)
confidence: 99%
“…Such data dependency reduces the computational efficiency because data must wait to be processed when they are being processed, otherwise, the results are incorrect [16]. In addition, degraded performance is also observed in other subroutines: (2) Green Gauss, (3) …”
Section: Target Subroutines (mentioning)
confidence: 99%
“…Both approaches are not CPU-free and the numbers are not generated on demand. GPU-based random numbers generators are discussed by Nguyen (2007) and Thomas et al (2009). Bastos-Filho et al (2010) presented a CPU-free approach for generating random numbers on demand based on the Xorshift generator (Marsaglia (2003)).…”
Section: Synchronization Barriers (mentioning)
confidence: 99%
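
For readers unfamiliar with the Xorshift generator named in the statement above, the following is a minimal Python sketch of a 32-bit Xorshift step using the shift triple (13, 17, 5) that commonly appears in presentations of Marsaglia's generator. It is illustrative only: the names `xorshift32`, `take`, and `MASK32` are hypothetical, and nothing here is taken from the GPU implementation of Bastos-Filho et al. or from the paper under discussion.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate value within 32 bits


def xorshift32(state: int) -> int:
    """Advance a 32-bit Xorshift state by one step and return the new state.

    Assumes the (13, 17, 5) shift triple; the state must be nonzero,
    because zero is a fixed point of the recurrence.
    """
    x = state & MASK32
    if x == 0:
        raise ValueError("state must be a nonzero 32-bit value")
    x ^= (x << 13) & MASK32  # left shift, truncated to 32 bits
    x ^= x >> 17             # right shift needs no mask
    x ^= (x << 5) & MASK32
    return x


def take(seed: int, count: int) -> list[int]:
    """Return `count` successive outputs starting from `seed`."""
    values = []
    state = seed
    for _ in range(count):
        state = xorshift32(state)
        values.append(state)
    return values


if __name__ == "__main__":
    print(take(1, 2))  # [270369, 67634689]
```

Because each step uses only three shifts and three XORs on a single word of state, a generator of this kind can keep one state per thread and produce values on demand, which is the property the quoted statement emphasizes.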
“…Floating-point arithmetic implementation on FPGA is inefficient for the same reason [12]. Often, a careful decision among alternative algorithms is necessary for optimal performance [7].…”
Section: A Application Domains (mentioning)
confidence: 99%