Proceedings of the 47th International Conference on Parallel Processing 2018
DOI: 10.1145/3225058.3225085

Combining Task-based Parallelism and Adaptive Mesh Refinement Techniques in Molecular Dynamics Simulations

Abstract: Modern parallel architectures require applications to generate massive parallelism so as to feed their large number of cores and their wide vector units. We revisit the extensively studied classical Molecular Dynamics N-body problem in the light of these hardware constraints. We use Adaptive Mesh Refinement techniques to store particles in memory, and to optimize the force computation loop using multi-threading and vectorization-friendly data structures. Our design is guided by the need for load balancing and …

Cited by 12 publications (9 citation statements)
References 29 publications
“…SFC-based domain decompositions yield an efficient partitioning scheme for stencil-like algorithms [200,201]. Adaptive variants, however, have not consistently proven to be beneficial [202][203][204], because of the more involved neighbor search. Additionally, in a short-range MD simulation, the main portion of the computational load is generated by the number of force pairs and only loosely coupled to the number of cells.…”
Section: Molecular Dynamics (mentioning)
confidence: 99%
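
The statement above refers to domain decomposition along a space-filling curve (SFC) and notes that short-range MD load tracks force pairs more than cell count. As a rough illustration only, the sketch below orders cells by a Morton (Z-order) key and cuts the curve into contiguous parts of roughly equal estimated load. It is not taken from the cited paper or from the citing work: the function names `interleave_bits` and `partition_cells`, the `cell_loads` mapping, and the choice of the Morton curve are all assumptions made for this example.

```python
def interleave_bits(x, y, z, bits=10):
    """Return the Morton (Z-order) key of integer cell coordinates.

    Hypothetical helper: bit i of x, y, z lands at positions
    3*i, 3*i + 1, 3*i + 2 of the key.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i + 2)
    return key


def partition_cells(cell_loads, n_parts):
    """Split cells into n_parts contiguous runs along the Morton curve.

    cell_loads maps (ix, iy, iz) -> estimated load of that cell
    (assumed here to be something like a force-pair count, not
    simply 1 per cell). Each cell goes to the part that contains
    the midpoint of its interval on the cumulative-load axis.
    """
    ordered = sorted(cell_loads, key=lambda c: interleave_bits(*c))
    total = sum(cell_loads.values())
    parts = [[] for _ in range(n_parts)]
    running = 0.0
    for cell in ordered:
        load = cell_loads[cell]
        if total > 0:
            idx = min(n_parts - 1, int((running + 0.5 * load) * n_parts / total))
        else:
            idx = 0  # no load information: everything stays in part 0
        parts[idx].append(cell)
        running += load
    return parts
```

With uniform loads this reduces to splitting the curve into equal-length runs; supplying per-cell pair estimates instead shifts the cut points toward dense regions, which is the distinction the quoted statement draws between load and cell count.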
“…Regarding affinity, it can be seen that it is advisable to select one of the available strategies instead of delegating the distribution to the operating system (none). Unlike scatter, balanced and compact guarantee the proximity among OpenMP threads with consecutive identifiers, minimizing in this way the data communication that each thread requires 6 . As it was mentioned in Section 4.3, the compiler detects false dependencies in that loop and it is not able to generate SIMD binary code by itself.…”
Section: Performance Results On the Intel Xeon Phi 7230 (mentioning)
confidence: 99%
“…Nowadays, the scientific community is experimenting with a new revolution on parallel processor technologies in the road to the Exascale. The novelties and enhancements not only involve hardware technologies but also changes in parallel programming models [6]. Beyond that, one of the most important challenges that still remains is how to perform large-scale simulations in a reasonable time using affordable computer systems.…”
Section: Introduction (mentioning)
confidence: 99%
“…A communication thread per rank is used to coordinate the work-stealing, while OpenMP tasks conduct computations. Prat et al [22] studied the taskification of computations in AMR applications using OpenMP tasks and dependencies, combined with cache blocking and vectorization techniques. However, they did not include the study of communication patterns.…”
Section: Related Work (mentioning)
confidence: 99%