2011
DOI: 10.1145/1993316.1993554

Automatic parallelization via matrix multiplication

Abstract: Existing work on the parallelization of complicated reductions and scans focuses mainly on formalism and has hardly dealt with implementation. To bridge the gap between formalism and implementation, we have integrated parallelization via matrix multiplication into compiler construction. Our framework can deal with complicated loops that existing techniques in compilers cannot parallelize. Moreover, we have refined our framework by developing two sets of techniques. One enhances its capability for para…

Cited by 20 publications (32 citation statements). References 13 publications.
“…As sublists_{↑7,+7} is a monoid homomorphism, we can execute it in parallel, say using p processors, which leads to the run time O((log p + n/p)·w²). This complexity resembles the run time of other parallel algorithms for the knapsack problem, e.g., the one given by [SI11]. The standard sequential algorithm has run time O(n·w).…”
Section: Developing Intuitions By Example
confidence: 97%
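The two-phase cost structure quoted above, O((log p + n/p)·w²), arises from each of p workers folding roughly n/p elements and then the p partial results being combined; because the operation is a monoid homomorphism (associative, with an identity), any chunking yields the same answer. A minimal Python sketch of this pattern follows — the names `parallel_reduce` and `op`, and the thread-pool choice, are illustrative assumptions, not the cited algorithm:

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def parallel_reduce(op, xs, identity, p=4):
    """Reduce an associative `op` over xs using p workers.

    Each worker folds a chunk of about n/p elements (the n/p term);
    the p partial results are then combined (the log p term in a
    tree combine; done sequentially here for simplicity).
    """
    if not xs:
        return identity
    chunk = -(-len(xs) // p)  # ceiling division
    chunks = [xs[i:i + chunk] for i in range(0, len(xs), chunk)]
    with ThreadPoolExecutor(max_workers=p) as pool:
        partials = list(pool.map(lambda c: reduce(op, c, identity), chunks))
    return reduce(op, partials, identity)
```

Correctness depends only on associativity of `op` and `identity` being a true unit; with a non-associative operation the chunked result may differ from the sequential fold.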
“…This is in contrast to the specification which, when executed, would generate an intermediate result of size |S|^{2n}. Interestingly, the derived program is equivalent to a program obtained by parallelizing the Viterbi algorithm [He88, HC06] using matrix multiplication over a semiring [SI11].…”
Section: Finding a Most Likely Sequence Of Hidden States
confidence: 99%
“…Their idea of parallelization was formalized as quantifier elimination and generalized to cover recursive functions by Morihata and Matsuzaki [16]. Sato and Iwasaki [18] presented a lightweight method, based on Fisher et al.'s paper [9], that formalizes a given loop body as matrix multiplication over a semiring after extracting maximum operators. The complicated loops of reduce and scan that these studies dealt with are not in the scope of our system.…”
Section: Related Work
confidence: 99%
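The idea referenced here — rewriting a loop body that contains maximum operators into matrix multiplication over the max-plus semiring (⊕ = max, ⊗ = +) — can be illustrated on a maximum-prefix-sum loop, `s += x; m = max(m, s)`. This is a hedged sketch of the general technique, not the method of [18] itself; all names are ours. Since the max-plus matrix product is associative, the per-iteration matrices can be combined by any parallel reduction:

```python
from functools import reduce

NEG_INF = float("-inf")

def mp_matmul(A, B):
    # Max-plus matrix product: C[i][j] = max_k (A[i][k] + B[k][j]).
    return [[max(A[i][k] + B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def loop_matrix(x):
    # Encodes one iteration of `s += x; m = max(m, s)` acting on the
    # row vector [m, s] by right multiplication in max-plus algebra:
    #   m' = max(m + 0, s + x)
    #   s' = max(m + (-inf), s + x) = s + x
    return [[0.0, NEG_INF],
            [x,   x]]

MP_IDENTITY = [[0.0, NEG_INF],
               [NEG_INF, 0.0]]  # max-plus identity matrix

def max_prefix_sum(xs):
    # Combine all per-iteration matrices (associative, so this fold
    # could run as a parallel tree reduction), then apply the result
    # once to the initial state m = s = 0.
    M = reduce(mp_matmul, (loop_matrix(x) for x in xs), MP_IDENTITY)
    return max(0.0 + M[0][0], 0.0 + M[1][0])
```

The key design point is that a loop step combining `max` and `+` is a linear map in the max-plus semiring, so composing steps becomes matrix multiplication, which is associative even though the original loop carries a sequential dependence.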
“…The complicated loops of reduce and scan that these studies dealt with are not in the scope of our system. A parallelization strategy of extracting loops to be converted into skeletons, which we have adopted, has been presented in [18]. In this sense, our contribution relative to this work is not in automatic parallelization techniques.…”
Section: Related Work
confidence: 99%
“…Sato and Iwasaki [27] address the parallelization of complex reductions and scans. They transform the loop body into a matrix-multiplication form based on the reduce and scan parallel primitives.…”
Section: Related Work
confidence: 99%