Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation 2009
DOI: 10.1145/1542476.1542496

Towards a holistic approach to auto-parallelization

Abstract: Compiler-based auto-parallelization is a much-studied area, yet it has still not found widespread application. This is largely due to poor exploitation of application parallelism, resulting in performance levels far below those which a skilled expert programmer could achieve. We have identified two weaknesses in traditional parallelizing compilers and propose a novel, integrated approach, resulting in significant performance improvements of the generated parallel code. Using profile-driven paral…

Cited by 132 publications (10 citation statements)
References 42 publications
“…Because of the complexity of control and data flow in such programs, a compiler cannot easily infer the distance between a loop iteration that generates data and the ones that consume it. For conventional synchronization approaches [6,25,26,43,47,48], this assumption of dependences between all subsequent iterations leads to sequential chains that severely limit the performance sought by running loop iterations in parallel. These sequential chains, which include both communication and computation, have two sources of inefficiency.…”
Section: Opportunity
confidence: 99%
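The dependence problem described above can be illustrated with a small, hypothetical example (not taken from the cited paper): when the iteration that consumes a value is selected by runtime data, the dependence distance is unknown at compile time, so a conservative compiler must assume every iteration depends on the one before it.

```python
def irregular_update(a, idx):
    """Hypothetical loop with a data-dependent dependence distance.

    Iteration i reads a[i] and writes a[idx[i]]. Whether a later
    iteration consumes what iteration i produced depends on the runtime
    contents of idx, so a static compiler must conservatively assume a
    cross-iteration dependence and serialize the whole loop.
    """
    for i in range(len(a)):
        a[idx[i]] = a[i] + 1
    return a

# idx = identity: no cross-iteration dependence; the loop could run as a DOALL.
print(irregular_update([0, 0, 0, 0], [0, 1, 2, 3]))  # → [1, 1, 1, 1]

# idx shifted by one: each iteration reads the value the previous one wrote,
# forming exactly the kind of sequential chain the statement describes.
print(irregular_update([0, 0, 0, 0], [1, 2, 3, 0]))  # → [4, 1, 2, 3]
```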
“…Automatic parallelization of non-numerical programs. Several automatic methods to extract TLP have demonstrated respectable speedups on commodity multicore processors for non-numerical programs [6,16,27,29,30,43,49]. All of these methods transform loops into parallel threads.…”
Section: Related Work
confidence: 99%
“…Another line of work [3,28,44,46] extracts parallelism by ignoring data dependences without preserving soundness via misspeculation detection and recovery. These approaches extract parallelism either by sacrificing the program's output quality [3,28,46] or by depending on user approval [44]. Instead, Perspective extracts parallelism without violating the sequential program semantics.…”
Section: Related Work
confidence: 99%
“…A related technique is applied in the context of speculative parallelization of loops, where dynamic dependences across loop iterations are tracked [Rauchwerger and Padua 1995]. A few recent approaches of similar nature include Bridges et al [2007], Tian et al [2008], Zhong et al [2008], Wu et al [2008], Oancea and Mycroft [2008], and Tournavitis et al [2009]. To estimate parallel speedup of DAGs, Sarkar and Hennessy [1986] developed convex partitioning of DAGs.…”
Section: Related Work
confidence: 99%
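The dynamic dependence tracking mentioned for speculative loop parallelization [Rauchwerger and Padua 1995] can be sketched as follows. This is a simplified, illustrative check in the spirit of their LRPD test, not their actual implementation: iterations run speculatively while the elements each iteration reads and writes are recorded, and speculation succeeds only if no element written by one iteration is touched by a different one.

```python
def lrpd_style_check(n, reads, writes):
    """Illustrative post-execution dependence check (hypothetical names).

    reads[i] / writes[i] are the sets of array indices that iteration i
    touched during speculative parallel execution. The loop was safe to
    run as a DOALL only if no element written by one iteration was read
    or written by a different iteration; otherwise the runtime must
    discard the speculative state and re-execute sequentially.
    """
    last_writer = {}
    for i in range(n):
        for e in writes[i]:
            if e in last_writer and last_writer[e] != i:
                return False  # write-write conflict across iterations
            last_writer[e] = i
    for i in range(n):
        for e in reads[i]:
            if e in last_writer and last_writer[e] != i:
                return False  # cross-iteration flow or anti dependence
    return True  # no cross-iteration dependence observed: speculation succeeds

# Disjoint element sets per iteration: speculation succeeds.
print(lrpd_style_check(2, [{0}, {1}], [{0}, {1}]))  # → True

# Iteration 1 reads element 0, which iteration 0 wrote: misspeculation.
print(lrpd_style_check(2, [{0}, {0}], [{0}, {1}]))  # → False
```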