Proceedings of the Third ACM International Workshop on Many-Core Embedded Systems 2016
DOI: 10.1145/2934495.2934499

Dataflow Implementation of QR Decomposition on a Manycore

Abstract: While parallel computer architectures have become mainstream, application development on them is still challenging. There is a need for new tools, languages and programming models. Additionally, there is a lack of knowledge about the performance of parallel approaches of basic but important operations, such as the QR decomposition of a matrix, on current commercial manycore architectures. This paper evaluates a high level dataflow language (CAL), a source-to-source compiler (Cal2Many) and three QR decompositio…

Cited by 5 publications (4 citation statements)
References 12 publications

“…In this step, an important decision is the choice of NoC topology. The efficiency of the topology might change based on the domain of the applications. However, for dataflow applications, we suggest a 2D mesh structure based on our experience during previous work [22,58] with the Epiphany architecture [59]. This structure provides efficient core-to-core communication for dataflow applications in terms of bandwidth and latency.…”
Section: System Integration (mentioning)
confidence: 99%
“…It is also a part of the solution to the linear least squares problem and the basis of an eigenvalue algorithm (the QR algorithm). There are several different methods to perform QRD, such as the Givens Rotations, Householder and Gram-Schmidt methods [22].…”
Section: QR Decomposition (mentioning)
confidence: 99%
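
For readers unfamiliar with the methods named in the statement above, the following is a minimal sketch of QR decomposition by Givens rotations. It is an illustration only and is not the implementation evaluated in the cited paper; the function names `givens_qr` and `matmul` are hypothetical, and the code is plain sequential Python rather than a dataflow or manycore version.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def givens_qr(a):
    """Return (q, r) with a = q * r, q orthogonal, r upper triangular."""
    m, n = len(a), len(a[0])
    r = [[float(v) for v in row] for row in a]
    q = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for j in range(n):
        # Zero column j below the diagonal, working from the bottom row up.
        for i in range(m - 1, j, -1):
            x, y = r[i - 1][j], r[i][j]
            h = math.hypot(x, y)
            if h == 0.0:
                continue  # both entries are already zero
            c, s = x / h, y / h
            # Apply the rotation to rows i-1 and i of r.
            for k in range(n):
                u, v = r[i - 1][k], r[i][k]
                r[i - 1][k] = c * u + s * v
                r[i][k] = -s * u + c * v
            # Accumulate the transposed rotation into columns i-1 and i of q.
            for k in range(m):
                u, v = q[k][i - 1], q[k][i]
                q[k][i - 1] = c * u + s * v
                q[k][i] = -s * u + c * v
    return q, r
```

Each rotation touches only two rows, which is one reason Givens-based QRD is often considered for parallel hardware; whether that holds on a given manycore depends on the mapping and is not something this sketch demonstrates.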
“…With their technique, Savas et al. [125] observe speedup between 1.3 and 4.3 over hand-written code. In [126], Savas et al. compare CAL-optimized scientific applications deployed on Epiphany [1] to hand-written GNU scientific library running on a unique ARM processor and observe a 30x speedup. They further note that the development of ap-…”
Section: Related Work (mentioning)
confidence: 99%