2008
DOI: 10.1093/ietisy/e91-d.12.2902

Cache Optimization for H.264/AVC Motion Compensation

Abstract: In this letter, we propose a cache organization that substantially reduces the memory bandwidth of motion compensation (MC) in H.264/AVC decoders. To reduce duplicated memory accesses to P and B pictures, we employ a four-way set-associative cache whose index bits are composed of horizontal and vertical address bits of the frame buffer, and whose lines each store an 8 × 2 pixel block of the reference frames. Moreover, we alleviate the data fragmentation problem by selecting a line size that equals …
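The indexing scheme described in the abstract can be sketched as follows. The 8 × 2 tile per cache line and the four-way associativity come from the abstract; the number of sets and the exact horizontal/vertical bit split below are illustrative assumptions, not the paper's values:

```python
# Sketch of a 2-D cache index for an MC reference-pixel cache.
# From the abstract: each line holds an 8x2 pixel tile, and the set
# index mixes horizontal and vertical frame-buffer address bits.
# Assumed for illustration: 64 sets, split as 3 horizontal + 3 vertical bits.

TILE_W, TILE_H = 8, 2      # pixels per cache line (from the abstract)
H_BITS, V_BITS = 3, 3      # assumed index-bit split (64 sets total)

def cache_set_index(x: int, y: int) -> int:
    """Map a pixel coordinate in the reference frame to a cache set.

    The tile coordinate is (x // 8, y // 2); the low bits of the
    horizontal and vertical tile coordinates are concatenated, so
    tiles that are neighbours in either direction land in different
    sets and overlapping MC reads do not thrash one set.
    """
    tile_x = x // TILE_W
    tile_y = y // TILE_H
    h_part = tile_x & ((1 << H_BITS) - 1)
    v_part = tile_y & ((1 << V_BITS) - 1)
    return (v_part << H_BITS) | h_part

# Horizontally and vertically adjacent tiles map to distinct sets.
assert cache_set_index(0, 0) != cache_set_index(8, 0)
assert cache_set_index(0, 0) != cache_set_index(0, 2)
```

Pixels inside the same 8 × 2 tile share one line, which is what lets overlapping reads for neighbouring sub-blocks hit in the cache instead of going back to the frame buffer.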

Citing publications span 2010–2023.


Cited by 4 publications (8 citation statements)
References 7 publications (9 reference statements)
“…by implementing special instructions for the FIFO access. Additionally, we should also decrease T_T by the following schemes: (1) adopting a high-bandwidth DRAM and an efficient SDRAM controller [4], (2) using a wider memory data bus, and (3) linking multiple DMA transfers for the inter-prediction of a macroblock, where each DMA transfer is arranged to deliver the minimum number of pixels for its corresponding sub-block of the macroblock in H.264 [6]. Scheme (3) can reduce the required memory bandwidth substantially because the data transfer for inter-prediction occupies 73.4% of the total bandwidth of DMA data transfers, which corresponds to 59.6% of the total bandwidth to the SDRAM.…”
Section: Results
confidence: 99%
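The two percentages quoted in this statement together imply what share of total SDRAM traffic the DMA transfers as a whole account for; a minimal check of that arithmetic:

```python
# Figures taken directly from the citation statement above:
inter_pred_share_of_dma = 0.734    # inter-prediction share of DMA transfer bandwidth
inter_pred_share_of_sdram = 0.596  # the same traffic as a share of total SDRAM bandwidth

# If the same inter-prediction traffic is 73.4% of DMA bandwidth but only
# 59.6% of SDRAM bandwidth, then DMA transfers overall must account for
# 0.596 / 0.734 of SDRAM traffic, i.e. roughly 81%.
dma_share_of_sdram = inter_pred_share_of_sdram / inter_pred_share_of_dma
assert abs(dma_share_of_sdram - 0.812) < 0.001
```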
“…Since the luma 4 × 4 block represents the most demanding case with respect to memory accesses [3] and computational intensity for q-pel MC, the focus will be put on this type of block and its associated operations to prove the efficiency of the proposed method for a standard H.264 decoder.…”
Section: Problem Definition
confidence: 99%
“…Wang [2] and Yoon [3] concluded that MC requires 75% of all memory accesses in an H.264 decoder, in contrast with only 10% required for storing the frames. This high memory-access ratio of the MC module demands highly optimized memory accesses to improve the overall performance of the decoder.…”
Section: Introduction
confidence: 99%