1996
DOI: 10.1109/71.508249

A unified framework for optimizing communication in data-parallel programs

Abstract: This paper presents a framework, based on global array data-flow analysis, to reduce communication costs in a program being compiled for a distributed memory machine. We introduce the available section descriptor, a novel representation of communication involving array sections. This representation allows us to apply techniques for partial redundancy elimination to obtain powerful communication optimizations. With a single framework, we are able to capture optimizations like (i) vectorizing communication, (ii) elimi…

Cited by 61 publications (42 citation statements)
References 28 publications
“…It is well-known [9], [10], [11], [12] that the overhead to access nonlocal data from remote processors on distributed memory architectures is commonly orders of magnitude higher than the cost of accessing local data. Communication overhead is, therefore, one of the most important metrics in choosing an appropriate data distribution.…”
Section: 1 (mentioning)
confidence: 99%
“…When using static analysis for data coalescing in Unified Parallel C [14,8] and High Performance Fortran [13,25], the compiler identifies, through data and control flow analysis, shared accesses to specific threads and creates a single run-time call to access multiple data items from the same thread. However, existing solutions do not completely remove the calls.…”
Section: Related Work (mentioning)
confidence: 99%
“…Proposed methods to improve fine-grained communication in PGAS languages include inspector-executor transformation [27,11,5], static coalescing [14,8,25], limited privatization [10,15], and software caching [39]. However, a big hurdle in the code generation of UPC language is that the compiler ends up inserting runtime calls to transform UPC "shared" accesses into requests for data (or actions) to other address partitions.…”
Section: Introduction (mentioning)
confidence: 99%
“…the SUIF project [40-42] and the Paradigm compiler [43,44]). However, they target an architecture which is very different from ours.…”
Section: Memory Optimisation In Multi-threaded Applications (mentioning)
confidence: 99%