2005
DOI: 10.1504/ijhpcn.2005.009429

Fast indexing for blocked array layouts to reduce cache misses

Abstract: The increasing disparity between memory latency and processor speed is a critical bottleneck in achieving high performance. Recently, several studies have examined blocked data layouts in conjunction with loop tiling to improve locality of reference. In this paper, we further reduce cache misses by restructuring the memory layout of multi-dimensional arrays so that array elements are stored in a blocked way, exactly as they are swept by the tiled instruction stream. A straightforward way is …
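To make the idea of a blocked layout concrete, the following is a minimal sketch of the plain division/modulo mapping from a 2-D index to an offset in a tile-by-tile layout. It is not taken from the paper and is not claimed to be its fast indexing scheme; the function name `blocked_index`, its parameters, and the choice of row-major order both across tiles and inside each tile are assumptions made for illustration.

```python
def blocked_index(i, j, n_cols, tile):
    """Offset of element (i, j) in a blocked (tiled) layout.

    Assumptions (illustrative, not from the paper): n_cols is a multiple
    of tile, tiles are stored in row-major order, and the elements inside
    each tile are stored in row-major order as well.
    """
    tiles_per_row = n_cols // tile
    tile_row, in_row = divmod(i, tile)   # which tile row, and row inside the tile
    tile_col, in_col = divmod(j, tile)   # which tile column, and column inside the tile
    tile_number = tile_row * tiles_per_row + tile_col
    # Every tile occupies tile * tile consecutive slots in memory.
    return tile_number * tile * tile + in_row * tile + in_col


# Offsets for a 4 x 4 array stored as 2 x 2 tiles, listed in row-major (i, j) order.
offsets = [blocked_index(i, j, n_cols=4, tile=2) for i in range(4) for j in range(4)]
```

With this mapping, a loop nest that sweeps one tile at a time touches consecutive memory locations, which is the locality property the abstract describes. The divisions and modulo operations on every access are the cost that a faster indexing scheme would aim to avoid.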

Cited by 4 publications (2 citation statements)
References 34 publications
“…There has been significant work on runtime effects due to cache performance; however, most of this research focuses on minimizing cache misses [1,2,8,17,18,19]. By minimizing cache misses, energy spent in accessing memory is decreased, and the overall application runtime is improved.…”
Section: Related Work (mentioning)
confidence: 99%
“…(2) The approach cannot support simultaneous parallel small tile access. Although other studies have used the Z‐Morton layout to exploit 2‐D data locality and reduce conflict misses, those approaches have increased the cycle time or the access latency due to the versatility of the Morton‐index translations. In contrast, Lim and Thottethodi proposed a hardware‐based bit‐permuting unit to translate the raster scan order address to a Z‐Morton address.…”
Section: Introduction (mentioning)
confidence: 99%
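
The raster-to-Z-Morton translation mentioned in the statement above can be sketched in software as bit interleaving. This is a minimal illustration only, not the hardware unit or any cited implementation; the function name `morton_index`, the `bits` parameter, and the convention that column bits take the even positions and row bits the odd positions are assumptions.

```python
def morton_index(row, col, bits=16):
    """Z-Morton index of (row, col), built by interleaving coordinate bits.

    Assumed convention (illustrative): bit b of col goes to position 2*b,
    and bit b of row goes to position 2*b + 1. Coordinates must fit in
    `bits` bits.
    """
    z = 0
    for b in range(bits):
        z |= ((col >> b) & 1) << (2 * b)
        z |= ((row >> b) & 1) << (2 * b + 1)
    return z


# Z-order positions of a 4 x 4 grid, listed in raster (row-major) order.
z_order = [morton_index(r, c) for r in range(4) for c in range(4)]
```

The per-access loop over bits shows why a software translation adds latency, which is the overhead the quoted statement attributes to Morton-index translation.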