2017 46th International Conference on Parallel Processing (ICPP) 2017
DOI: 10.1109/icpp.2017.52

Optimizations of Two Compute-Bound Scientific Kernels on the SW26010 Many-Core Processor

Cited by 17 publications (13 citation statements)
References 16 publications
“…In recent years, we have started to see projects that refactor WENO for heterogeneous architectures, such as refactoring WENO on the CPU-GPU architecture [31], refactoring the community atmosphere model [12], phase field simulations of coarsening dynamics [30], 10M-core scalable fully-implicit solver for nonhydrostatic atmospheric dynamics [32], molecular dynamics simulation [33], sea ice model algorithms [34], two compute-bound scientific kernels [35], etc., on the Sunway TaihuLight system.…”
Section: Related Work (mentioning)
confidence: 99%
“…In addition, there are multiple applications used in other disciplines that help double-buffering techniques to take samples (or transfer them between memories) while the data are processed in the same processor as Tan et al [33] suggest. This double-buffering technique has been used in image processing [39], paralleling processing [25] or ultrasound imaging methods [4]. However, our approach uses this double-buffering technique by adding a pipeline processing composed of timers, interruptions and a state machine.…”
Section: Data Acquisition Processes (mentioning)
confidence: 99%
“…Compared with recent research on regular problems on Sunway architecture, such as stencil [3], DNN [12], GEMM [18,27] and fully-implicit solver for nonhydrostatic atmospheric dynamics [66], our work presented in this paper is more complicated, as the irregularities from various matrix sparsity structures are dynamic and leveraging such irregularity is known to be more challenging [57,67].…”
Section: Related Work (mentioning)
confidence: 99%