2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
DOI: 10.1109/ipdpsw.2016.104

Refactoring Conventional Task Schedulers to Exploit Asymmetric ARM big.LITTLE Architectures in Dense Linear Algebra

Abstract: Dealing with asymmetry in the architecture opens a plethora of questions from the perspective of scheduling task-parallel applications, and there exist early attempts to address this problem via ad-hoc strategies embedded into a runtime framework. In this paper we take a different path, which consists in addressing the complexity of the problem at the library level, via a few asymmetry-aware fundamental kernels, hiding the architecture heterogeneity from the task scheduler. For the specific domain of dense lin…
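To make the library-level approach sketched in the abstract concrete, the following is a minimal sketch in C with OpenMP (not the authors' code): the split of work between big and LITTLE cores is decided inside the kernel itself, so a conventional task scheduler on top of it never has to reason about the asymmetry. The function names (gemm_panel_asym, gemm_block), the 4+4 core configuration, the pinning of threads 0..3 to big cores (e.g. via OMP_PLACES/OMP_PROC_BIND), and the 0.35 relative throughput of a LITTLE core are all assumptions made for illustration.

/* Asymmetry-aware kernel sketch: work is partitioned inside the kernel,
 * so the scheduler above it sees an ordinary, symmetric task. */
#include <omp.h>
#include <stddef.h>

#define NBIG 4            /* assumed number of big cores                  */
#define NLITTLE 4         /* assumed number of LITTLE cores               */
#define LITTLE_SPEED 0.35 /* assumed relative throughput of a LITTLE core */

/* C(m x n) += A(m x k) * B(k x n), row-major, contiguous storage. */
static void gemm_block(const double *A, const double *B, double *C,
                       size_t m, size_t n, size_t k)
{
    for (size_t i = 0; i < m; ++i)
        for (size_t p = 0; p < k; ++p)
            for (size_t j = 0; j < n; ++j)
                C[i * n + j] += A[i * k + p] * B[p * n + j];
}

/* Rows of C are divided so that each core class receives a share
 * proportional to its aggregate throughput. Threads 0..NBIG-1 are
 * assumed to be pinned to the big cluster, the rest to the LITTLE one. */
void gemm_panel_asym(const double *A, const double *B, double *C,
                     size_t m, size_t n, size_t k)
{
    const double big_share = NBIG / (NBIG + NLITTLE * LITTLE_SPEED);
    const size_t m_big = (size_t)(big_share * (double)m);

    #pragma omp parallel num_threads(NBIG + NLITTLE)
    {
        int t = omp_get_thread_num();
        size_t lo, hi;
        if (t < NBIG) {                  /* big cores: rows [0, m_big)    */
            size_t chunk = (m_big + NBIG - 1) / NBIG;
            lo = (size_t)t * chunk;
            hi = (lo + chunk < m_big) ? lo + chunk : m_big;
        } else {                         /* LITTLE cores: rows [m_big, m) */
            size_t rows = m - m_big;
            size_t chunk = (rows + NLITTLE - 1) / NLITTLE;
            lo = m_big + (size_t)(t - NBIG) * chunk;
            hi = (lo + chunk < m) ? lo + chunk : m;
        }
        if (lo < hi)
            gemm_block(A + lo * k, B, C + lo * n, hi - lo, n, k);
    }
}

The interface of gemm_panel_asym is that of an ordinary kernel; only its internal row partitioning knows about the two core classes, which is the sense in which the heterogeneity is hidden from the task scheduler.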


Citations: cited by 7 publications (10 citation statements)
References: 15 publications
“…On the Odroid XU4 board, the improvements obtained are lower than expected, considering that the parallel version uses eight cores instead of four. Nevertheless, previous works have shown that using the LITTLE cores has little impact on performance compared with the big ones, and in some cases even increases the execution time…”
Section: Accelerating the Execution Through Parallelization (citation type: mentioning)
confidence: 98%
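A rough back-of-the-envelope estimate illustrates why the eight-core runs discussed above gain less than the core count alone would suggest (the 0.35 relative throughput assumed for a LITTLE core is an illustrative figure for a compute-bound kernel, not a value taken from the cited works):

    effective cores ≈ 4 big + 4 × 0.35 LITTLE = 5.4

Moving from the 4 big cores to all 8 cores can therefore add at most about 5.4 / 4 ≈ 1.35× speed-up rather than 2×, and scheduling overhead or memory contention can shrink the gain further, which is consistent with the observation that the LITTLE cores sometimes even increase the execution time.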
“…Basically, this programming model is based on including directives (#pragmas), as in other parallel programming models such as OpenMP. These directives are mostly used to annotate certain code blocks to indicate that those blocks are tasks; that is, the basic scheduling units to be executed by the available computational resources…”
Section: Accelerating the Execution Through Parallelization (citation type: mentioning)
confidence: 99%
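As a concrete illustration of the directive-based tasking style described in this statement, the short C/OpenMP sketch below annotates per-block updates as tasks; the blocking scheme, the block_update stand-in kernel and the sizes are illustrative assumptions, and the runtime targeted by the cited work may use its own, slightly different pragmas.

#include <omp.h>

#define NB 8     /* assumed number of blocks in the panel */
#define BS 256   /* assumed block size (elements)         */

/* Stand-in for the real per-block computational kernel. */
static void block_update(double *blk, int n)
{
    for (int i = 0; i < n; ++i)
        blk[i] *= 2.0;
}

void process_panel(double *panel)
{
    #pragma omp parallel
    #pragma omp single                 /* one thread creates the tasks...  */
    for (int b = 0; b < NB; ++b) {
        double *blk = panel + b * BS;
        #pragma omp task firstprivate(blk) depend(inout: blk[0:BS])
        block_update(blk, BS);         /* ...any idle thread executes them */
    }
}

Each annotated block becomes a basic scheduling unit that the runtime hands to whichever computational resource becomes available, which is exactly the behaviour the quoted statement describes.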
“…I). Thus, actions will be the following: … ([3,2], [9,20], …) … ([6,4], [16,40], {x_3, m6_6, m7_9}) | Φ(v_5, {m15_5}) → ([6,4], [16,40], {x_4, …}) … ([7,4], [18,40], {x_7, m11_7}) | Φ(v_3, {m17_3}) → ([7,4], [18,40], {x_8, …}…”
Section: B. Functional Specification of Distributed Systems (citation type: mentioning)
confidence: 99%
“…A_5: Φ(v_6, {m6_6}) → ([5,3], [13,29], {x_9, m13_7, m14_8}) | Φ(v_6, {m16_6}) → ([5,3], [13,29], {x_10, m15_5, m14_8}); A_6: Φ(v_7, {m11_7}) → ([9,5], [23,50], {x_11, m16_6}) | Φ(v_7, {m13_7}) → ([9,5], [23,50], {x_12, …}) … ([5,3], [12,30], {m28_5, …}…”
Section: … (citation type: mentioning)
confidence: 99%