2012
DOI: 10.1140/epjst/e2012-01645-8

Comparison of different parallel implementations of the 2+1-dimensional KPZ model and the 3-dimensional KMC model

Abstract: We show that efficient simulations of the Kardar-Parisi-Zhang interface growth in 2 + 1 dimensions and of the 3-dimensional Kinetic Monte Carlo of thermally activated diffusion can be realized both on GPUs and modern CPUs. In this article we present results of different implementations on GPUs using CUDA and OpenCL and also on CPUs using OpenCL and MPI. We investigate the runtime and scaling behavior on different architectures to find optimal solutions for solving current simulation problems in the field of st…

Cited by 9 publications (19 citation statements)
References 45 publications
“…For this work we added the capability to perform simulations with arbitrary probabilities p and q. Benchmarks, comparing our GPU implementation on a Tesla C2070 to the optimized sequential CPU implementation running on an Intel Xeon X5650 at 2.67 GHz, have shown a speedup factor of about 230 for the raw simulation. The basic version from [47], which contains less computational effort per update, reaches a raw simulation speedup of about 100, in the same setup.…”
Section: Bit-coded Graphics Processing Unit Algorithms (mentioning)
confidence: 96%
“…A more detailed description of our CUDA implementation can be found in [46,47]. For this work we added the capability to perform simulations with arbitrary probabilities p and q. Benchmarks, comparing our GPU implementation on a Tesla C2070 to the optimized sequential CPU implementation running on an Intel Xeon X5650 at 2.67 GHz, have shown a speedup factor of about 230 for the raw simulation.…”
Section: Bit-coded Graphics Processing Unit Algorithms (mentioning)
confidence: 99%
“…A double-tiling scheme was applied by splitting up the simulation cells into tiles, split further into two subtiles along each spatial direction [34]. In the present two-dimensional problem this yields 2^d = 4 sets of subtiles, each of which can be updated by multiple independent workers.…”
Section: Models and Simulation Algorithms (mentioning)
confidence: 99%
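
The double-tiling decomposition quoted above can be illustrated with a minimal sketch. It is not the cited authors' implementation: the function name `subtile_sets`, the lattice size, the tile size, and the periodic square lattice are all assumptions chosen for illustration. The sketch only shows how halving each tile along every direction produces 2^d groups of subtiles, where subtiles in the same group never touch and could therefore be handed to independent workers.

```python
from itertools import product


def subtile_sets(lattice_size, tile_size, d=2):
    """Group the subtiles of a d-dimensional lattice into 2^d sets.

    Hypothetical helper for illustration only. Each tile is halved along
    every spatial direction; the parity tuple (0 or 1 per direction)
    identifies which half a subtile is, and also which set it belongs to.
    Returns a dict mapping parity -> list of subtile origins, plus the
    subtile edge length.
    """
    if lattice_size % tile_size != 0 or tile_size % 2 != 0:
        raise ValueError("tile_size must be even and divide lattice_size")
    half = tile_size // 2
    tiles_per_side = lattice_size // tile_size
    sets = {}
    for parity in product((0, 1), repeat=d):
        origins = []
        for tile_index in product(range(tiles_per_side), repeat=d):
            # Origin of the subtile with this parity inside this tile.
            origin = tuple(
                t * tile_size + p * half for t, p in zip(tile_index, parity)
            )
            origins.append(origin)
        sets[parity] = origins
    return sets, half


if __name__ == "__main__":
    sets, half = subtile_sets(lattice_size=16, tile_size=8)
    for parity, origins in sets.items():
        print(parity, origins)
```

Within one set, neighbouring subtiles are separated by a gap of one subtile width, so a local update rule whose range is shorter than that gap cannot cause two workers to touch the same site. Processing the 2^d sets one after another then covers the whole lattice.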
“…We tested the tool with various datasets. In the sponge dataset [2], which was already mentioned in the introduction section, we tackled the volumetric imaging of a highly complicated structure. In the dataset we used, the stoichiometry of SiO_x was fixed to x = 1, i.e., SiO by setting the silicon excess to 30 vol.%.…”
Section: Benchmarks (mentioning)
confidence: 99%
“…For instance, the sponge dataset [2] is a material produced from silicate, which has interesting nano-technological properties. Very recently, it has been experimentally shown that a silicon-rich oxide film can decay into a silicon nanowire network embedded in SiO_2 by spinodal decomposition during rapid thermal treatment [3], which has also been confirmed by accompanying kinetic Monte Carlo simulations [4].…”
Section: Introduction (mentioning)
confidence: 99%