2015
DOI: 10.1145/2687414

Optimal Parallelogram Selection for Hierarchical Tiling

Abstract: Loop tiling is an effective optimization to improve performance of multiply nested loops, which are the most time-consuming parts in many programs. Most massively parallel systems today are organized hierarchically, and different levels of the hierarchy differ in the organization of parallelism and the memory models they adopt. To make better use of these machines, it is clear that loop nests should be tiled hierarchically to fit the hierarchical organization of the machine; however, it is not so clear what sh…

Cited by 4 publications (3 citation statements)
References 23 publications
“…We do not believe that parallelogram tiles perform better than any of these in general, and the diamond tiling evaluation provides some evidence of this [6]. Our results motivate further work in surveying all known tile shapes, revisiting the Zhou et al [36] shape selection results; we are still missing a comprehensive understanding of the relative merits of each shape, depending on the application, dataset, and target architecture.…”
Section: Performance on GPU Architectures
mentioning
confidence: 71%
“…These problems are either computationally intensive, or working on large-scale multidimensional data or both [1,2]. Nested loops are one of the most time-consuming parts and the largest sources of parallelism in these problems [3,4]. In order to meet the ever-increasing computing requirement of scientific applications, it is necessary to use high-level computational capacity and optimization techniques.…”
mentioning
confidence: 99%
“…Loop optimization and parallelization have always been an important role to achieve higher performance [4]. A lot of loop optimization techniques have been developed to decrease the execution time of the nested loops and improve the performance.…”
mentioning
confidence: 99%