2019
DOI: 10.1155/2019/6825728

Performance Optimization and Modeling of Fine-Grained Irregular Communication in UPC

Abstract: The UPC programming language offers parallelism via logically partitioned shared memory, which typically spans physically disjoint memory sub-systems. One convenient feature of UPC is its ability to automatically execute between-thread data movement, such that the entire content of a shared data array appears to be freely accessible by all the threads. The programmer friendliness, however, can come at the cost of substantial performance penalties. This is especially true when indirectly indexing the elements o…

Cited by 3 publications (13 citation statements)
References 19 publications (32 reference statements)
“…We use an existing UPC code 1 that was tested in previous work [13]. The UPC++ implementation is derived from this UPC implementation.…”
Section: Heat Equation (citation type: mentioning)
confidence: 99%
“…Thus, for storing the values in the off-diagonal part A, it is usual to use two 1D arrays both of length n · r_nz (instead of two n × r_nz 2D tables). Therefore, one 1D array contains the off-diagonal values consecutively row by row, whereas the other 1D array contains the corresponding integer column indices [13].…”
Section: Sparse Matrix-vector Multiplication (citation type: mentioning)
confidence: 99%
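
The storage layout described in the statement above can be illustrated with a minimal C sketch. It is not the cited paper's code. The function name `spmv_offdiag_1d`, the parameter names, and the separate `diag` array for the main diagonal are assumptions made for illustration; the statement itself only specifies the two 1D arrays of length n · r_nz for the off-diagonal values and their column indices.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sketch: y = M * x for a sparse n x n matrix M with exactly r_nz
 * off-diagonal entries per row (rows with fewer can be padded with zeros).
 *
 * Assumed layout (names are hypothetical):
 *   diag[i]                  main-diagonal entry of row i (an assumption;
 *                            the quoted text covers only the off-diagonal part)
 *   offdiag_vals[i*r_nz + j] j-th off-diagonal value of row i, row by row
 *   col_idx[i*r_nz + j]      column index of that value
 */
void spmv_offdiag_1d(size_t n, size_t r_nz,
                     const double *diag,
                     const double *offdiag_vals,
                     const int *col_idx,
                     const double *x,
                     double *y)
{
    for (size_t i = 0; i < n; i++) {
        double sum = diag[i] * x[i];
        for (size_t j = 0; j < r_nz; j++) {
            size_t k = i * r_nz + j;   /* flat index replaces the 2D [i][j] */
            /* Guard against out-of-range column indices in debug builds. */
            assert(col_idx[k] >= 0 && (size_t)col_idx[k] < n);
            /* Indirect access x[col_idx[k]] is the irregular, fine-grained
               read pattern that such a layout produces. */
            sum += offdiag_vals[k] * x[col_idx[k]];
        }
        y[i] = sum;
    }
}
```

Using flat 1D arrays keeps each row's r_nz values contiguous in memory, so the inner loop walks `offdiag_vals` and `col_idx` with unit stride while only the reads from `x` are indirect.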
“…Many of the related works made experiments based on NASA advanced computing (NAS) benchmarks (Curtis Maury et al, 2006; Freeh et al, 2007; Li et al, 2010; Marathe et al, 2015; Sundriyal et al, 2014) that are widely accepted as representative of scientific applications. Lagravière et al (2015) investigate the MFlops/Watt metric between MPI, unified parallel C (UPC), and OpenMP. Schöne and Molka (2014) propose a framework to dynamically change configuration of hardware and software.…”
Section: Related Work (citation type: mentioning)
confidence: 99%