Manycore challenge in particle-in-cell simulation: How to exploit 1 TFlops peak performance for simulation codes with irregular computation

Nakashima, Hiroshi

doi:10.1016/j.compeleceng.2015.03.010

Cited by 12 publications

(7 citation statements)

References 20 publications


“…While the computational characteristics of PIC codes are fairly well understood, much of the communities optimization and development efforts to date have been focused around platform and hardware specific optimization [31], [32]. As heterogeneity within leading supercomputers increases [33], it is no longer viable to periodically optimize for a single target platform, making portability more important than ever.…”

Section: Related Work (mentioning)

confidence: 99%

VPIC 2.0: Next Generation Particle-in-Cell Simulations

Bird, Tan, Luedtke, et al. 2022

IEEE Trans. Parallel Distrib. Syst.


VPIC is a general-purpose particle-in-cell simulation code for modeling plasma phenomena such as magnetic reconnection, fusion, solar weather, and laser-plasma interaction in three dimensions using large numbers of particles. VPIC's capacity in both fidelity and scale makes it particularly well-suited for plasma research on pre-exascale and exascale platforms. In this paper we demonstrate the unique challenges involved in preparing the VPIC code for operation at exascale, outlining important optimizations to make VPIC efficient on accelerators. Specifically, we show the work undertaken in adapting VPIC to exploit the portability-enabling framework Kokkos and highlight the enhancements to VPIC's modeling capabilities to achieve performance at exascale. We assess the achieved performance-portability trade-off through a suite of studies on nine different varieties of modern pre-exascale hardware. Our performance-portability study includes weak-scaling runs on three of the top ten TOP500 supercomputers, as well as a comparison of low-level system performance of hardware from four different vendors.


“…On the other hand, high performance computing (HPC) optimizations intend to improve the algorithm efficiency and scalability in order to reach higher resolutions and therefore better accuracy. This includes sorting techniques [5][6][7], vectorization [8,9], load balancing [7,10], advanced programming models [11], etc. These improvements are done at the cost of an increased complexity which sometimes prevent their combination.…”

Section: Introduction (mentioning)

confidence: 99%

Single Domain Multiple Decompositions for Particle-in-Cell simulations

Derouillat, Beck 2019

Preprint


As a multi-purpose Particle-In-Cell (PIC) code, Smilei gathers many different features in a single software package. Combining some of them is challenging. In particular, spectral solvers and patch-based load balancing have a priori incompatible requirements. This paper introduces the Single Domain Multiple Decompositions (SDMD) method in order to address this issue. To do so, different domain decompositions are used for field and particle operations. This approach makes it possible to keep small domains for particles, which are necessary for good load balancing, while having large domains for the fields. It proves beneficial in mitigating synchronization costs and gives the opportunity to introduce more parallelism in the PIC algorithm, on top of providing structures compatible with spectral solvers.


“…Although several successful porting of applications to Xeon Phi coprocessors have been reported recently [19][20][21], it is not certain if a significant portion of peak performance is achievable for a wide class of applications, and does efficient porting of an existing application require code tuning in scope of OpenMP or rather massive rewriting similar to porting to GPUs. A brief analysis shows that although a straightforward porting can be done very quickly even for a large application, it will be efficient only if the application was properly optimized for CPUs and has a large degree of parallelism on thread-level and SIMD-level.…”

Section: Introduction (mentioning)

confidence: 99%

Particle-in-Cell laser-plasma simulation on Xeon Phi coprocessors

Surmin, Bastrakov, Efimenko, et al. 2016

Computer Physics Communications


This paper concerns the development of a high-performance implementation of the Particle-in-Cell method for plasma simulation on Intel Xeon Phi coprocessors. We discuss the suitability of the method for Xeon Phi architecture and present our experience in the porting and optimization of the existing parallel Particle-in-Cell code PICADOR. Direct porting without code modification gives performance on Xeon Phi close to that of an 8-core CPU on a benchmark problem with 50 particles per cell. We demonstrate step-by-step optimization techniques, such as improving data locality, enhancing parallelization efficiency and vectorization leading to an overall 4.2× speedup on CPU and 7.5× on Xeon Phi compared to the baseline version. The optimized version achieves 16.9 ns per particle update on an Intel Xeon E5-2660 CPU and 9.3 ns per particle update on an Intel Xeon Phi 5110P. For a real problem of laser ion acceleration in targets with surface grating, where a large number of macroparticles per cell is required, the speedup of Xeon Phi compared to CPU is 1.6×.

