1995
DOI: 10.1007/bf02577866
A scalable method for run-time loop parallelization

Abstract: Current parallelizing compilers do a reasonable job of extracting parallelism from programs with regular, well-behaved, statically analyzable access patterns. However, they cannot extract a significant fraction of the available parallelism if the program has a complex and/or statically insufficiently defined access pattern, e.g., simulation programs with irregular domains and/or dynamically changing interactions. Since such programs represent a large fraction of all applications, techniques are needed for extr…

Cited by 56 publications (35 citation statements)
References 31 publications
“…This executable will include code for adaptive run-time techniques that allow the application to make on-the-fly decisions about various optimizations. To this end, we will use our techniques for detecting and exploiting loop-level parallelism in various cases encountered in irregular applications [24,27,26]. Load balancing will be achieved through feedback-guided blocked scheduling [11], which allows highly imbalanced loops to be block-scheduled by predicting a good work distribution from previously measured execution times of iteration blocks.…”
Section: System Architecture
confidence: 99%
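To make the feedback-guided blocked scheduling idea in that statement concrete: if the loop is executed repeatedly (a time-step loop, say), per-iteration times measured on one execution can be used to choose block boundaries that equalize predicted work on the next. The C sketch below shows one plausible form of that boundary computation; the function name and interface are illustrative assumptions, not the actual scheme of [11].

    #include <stddef.h>

    /* Hypothetical helper: given per-iteration times measured on the previous
       execution, pick block boundaries so that each of nthreads blocks gets
       roughly equal predicted work. 'bounds' receives nthreads+1 entries,
       with bounds[0] = 0 and bounds[nthreads] = niters. */
    void predict_boundaries(const double *iter_time, size_t niters,
                            int nthreads, size_t *bounds)
    {
        double total = 0.0;
        for (size_t i = 0; i < niters; i++)
            total += iter_time[i];

        double target = total / nthreads;   /* ideal work per thread       */
        double acc = 0.0;
        int t = 1;
        bounds[0] = 0;
        for (size_t i = 0; i < niters && t < nthreads; i++) {
            acc += iter_time[i];
            if (acc >= target * t)          /* crossed the t-th work share */
                bounds[t++] = i + 1;
        }
        while (t <= nthreads)               /* close any remaining blocks  */
            bounds[t++] = niters;
    }

On the next execution, thread t would run iterations [bounds[t], bounds[t+1]), re-measure block times, and refine the boundaries again.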
“…We have developed several techniques [24][25][26][27] that can detect and exploit loop-level parallelism in various cases encountered in irregular applications: (i) a speculative method to detect fully parallel loops (the LRPD test), (ii) an inspector/executor technique to compute wavefronts (sequences of mutually independent sets of iterations that can be executed in parallel), and (iii) a technique for parallelizing while loops (do loops with an unknown number of iterations and/or containing linked-list traversals). We now briefly describe the utility of some of these techniques; details of their design can be found in [25][26][27][11] and other related publications.…”
Section: Run-time Parallelization
confidence: 99%
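The LRPD test named in this statement executes the loop speculatively in parallel while marking shadow arrays that record how each element of an array under test is referenced; the marks are then analyzed to accept or reject the speculation. The C sketch below shows only the marking and analysis logic, simulated sequentially and without the privatization and reduction extensions described in [24-27]; all names are illustrative, and the real test merges per-processor shadow arrays.

    #include <stdbool.h>
    #include <stddef.h>

    /* Per-element shadow state, filled in while the loop body runs.
       The caller initializes last_writer to -1 and the flags to false. */
    typedef struct {
        int  last_writer;   /* iteration that last wrote the element, or -1     */
        bool multi_writer;  /* written by more than one iteration               */
        bool exposed_read;  /* read before being written by the reading iteration */
    } shadow_t;

    static void mark_write(shadow_t *s, int iter)
    {
        if (s->last_writer >= 0 && s->last_writer != iter)
            s->multi_writer = true;     /* output dependence across iterations */
        s->last_writer = iter;
    }

    static void mark_read(shadow_t *s, int iter)
    {
        if (s->last_writer != iter)     /* not covered by this iteration's own write */
            s->exposed_read = true;
    }

    /* Post-execution analysis: declare the loop fully parallel only if no
       element was written by two iterations and no written element had an
       exposed read. Like the real test, this is conservative: it may reject
       some loops that happen to be parallel. */
    bool lrpd_passes(const shadow_t *shadow, size_t n)
    {
        for (size_t i = 0; i < n; i++) {
            if (shadow[i].multi_writer)
                return false;
            if (shadow[i].last_writer >= 0 && shadow[i].exposed_read)
                return false;
        }
        return true;
    }

If lrpd_passes returns false, a speculative system would discard the speculative state and re-execute the loop sequentially.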
“…We have developed several techniques [13,14,15] that can detect and exploit loop-level parallelism in various cases encountered in irregular applications: (i) a speculative method to detect fully parallel loops (the LRPD test), (ii) an inspector/executor technique to compute wavefronts (sequences of mutually independent sets of iterations that can be executed in parallel), and (iii) a technique for parallelizing while loops (do loops with an unknown number of iterations and/or containing linked-list traversals). In this paper we will mostly refer to the LRPD test and how it is used to detect fully parallel loops.…”
Section: Foundational Work - The LRPD Test for Dense Problems
confidence: 99%
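For the wavefront technique these statements describe, the inspector can be sketched under a simple access model: iteration i reads x[r[i]] and writes x[w[i]] through index arrays known only at run time. Each iteration gets a wavefront level one greater than that of the latest earlier iteration touching the same elements; iterations sharing a level are mutually independent. This is an illustrative simplification (it also serializes reads of the same element against each other, which is more conservative than necessary), not the cited algorithm itself.

    #include <stddef.h>

    /* Inspector sketch: compute a wavefront level per iteration for a loop
       where iteration i reads x[r[i]] and writes x[w[i]]. 'last_level_of'
       has one entry per element of x and is zero-initialized by the caller;
       it records the level of the last iteration that touched each element.
       Returns the number of wavefronts. */
    int compute_wavefronts(const size_t *r, const size_t *w, size_t niters,
                           int *level, int *last_level_of)
    {
        int nwaves = 0;
        for (size_t i = 0; i < niters; i++) {
            int dep = last_level_of[r[i]];      /* wait for the read element    */
            if (last_level_of[w[i]] > dep)
                dep = last_level_of[w[i]];      /* and for the written element  */
            level[i] = dep + 1;
            last_level_of[r[i]] = level[i];     /* conservative: reads serialize */
            last_level_of[w[i]] = level[i];
            if (level[i] > nwaves)
                nwaves = level[i];
        }
        return nwaves;
    }

The executor then runs, for each wave k = 1..nwaves, all iterations with level[i] == k in parallel, with a barrier between waves.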
“…(b) Run-time analysis techniques, which analyze the code's memory references during program execution and decide whether an optimization (e.g., parallelization) can be applied. Notable examples are the TLS (thread-level speculation) [22] and inspector/executor [23] techniques, which dynamically analyze memory reference traces to detect data dependencies. Run-time techniques are effective because they can extract most of the available parallelism, but they exhibit significant overhead.…”
Section: Introduction
confidence: 99%
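Both families of run-time analysis named in this last statement ultimately reduce to checking a trace of memory references for cross-iteration conflicts: two accesses to the same address from different iterations, at least one of them a write. A minimal, deliberately naive C sketch of that check, with hypothetical types (a real analysis would hash by address rather than compare all pairs):

    #include <stdbool.h>
    #include <stddef.h>

    typedef struct {
        size_t iter;     /* iteration that issued the reference  */
        size_t addr;     /* element index or canonical address   */
        bool   is_write;
    } ref_t;

    /* Returns true iff two references to the same address come from
       different iterations and at least one is a write, i.e., a
       cross-iteration data dependence exists. O(n^2) for brevity. */
    bool has_cross_iteration_dependence(const ref_t *trace, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            for (size_t j = i + 1; j < n; j++)
                if (trace[i].addr == trace[j].addr &&
                    trace[i].iter != trace[j].iter &&
                    (trace[i].is_write || trace[j].is_write))
                    return true;
        return false;
    }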