2016
DOI: 10.1002/cta.2308

Design exploration of efficient implementation on SoC heterogeneous platform: HEVC intra prediction application

Abstract: The relationship between the CPU and the hardware accelerator is critical, especially in systems that handle intensive tasks and large amounts of data, such as video coding systems. This cooperation provides significant improvements in run-time speed and power consumption. As software (SW) and hardware (HW) solutions provide better flexibility and performance, HW/SW implementation has emerged as a more efficient and desirable methodology for real-time implementation. In order to evaluate different impl…

Citations: cited by 18 publications (11 citation statements)
References: 15 publications
“…The methodology is ideal for incorporating into an FPGA or ASIC design flow, which has suggested a methodology that integrates standard building blocks into safe compound gates. Register Transfer Level is a level of abstraction of the effective framework used by M.Kammoun et al [11] in order to evaluate the various methods of hardware and software in terms of energy consumption and execution time, especially for video coding systems, as large data processing is required. A FPGA (Field Programmable Gate Arrays) based on Xilinx Zynq was used for this search.…”
Section: Abstraction Levels Classification
Citation type: mentioning
confidence: 99%
“…11. On the other hand, the work presented in [33] proved that the AXI4-Lite bus performances are limited since it provides a sequential transfer of only 32-bit data. This generates a huge communication time and makes the transfer a bit slow.…”
Section: SW/HW Solutions Exploration and Performance Evaluation
Citation type: mentioning
confidence: 99%
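To make the cost of word-by-word 32-bit transfers concrete, the sketch below counts how many bus words are needed to move one block of samples. It is only back-of-the-envelope arithmetic under stated assumptions: the helper name `count_bus_words`, the 16-bit sample width, and the dense packing of samples into words are all hypothetical and do not come from the cited work, and the count ignores any per-transfer handshake or addressing overhead.

```python
def count_bus_words(num_samples: int,
                    bits_per_sample: int = 16,
                    bus_width_bits: int = 32) -> int:
    """Return how many bus words are needed to move num_samples samples.

    Assumptions (not from the cited paper): samples are packed densely,
    and each bus word is transferred one after another.
    """
    if num_samples < 0 or bits_per_sample <= 0 or bus_width_bits <= 0:
        raise ValueError("sizes must be positive")
    total_bits = num_samples * bits_per_sample
    # Ceiling division: a partially filled last word still costs a transfer.
    return -(-total_bits // bus_width_bits)


# Word counts for the square block sizes from 4x4 to 32x32.
words_per_block = {n: count_bus_words(n * n) for n in (4, 8, 16, 32)}
```

Under these assumptions the count grows with the square of the block edge, which is why a sequential 32-bit interface can become the dominant cost for the largest blocks.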
“…It is used to support all transform matrix sizes (from 4 × 4 to 32 × 32 TUs). The IDCT equations defined in the HEVC test model decoder reference software version 10 (HM10) [20] present many multiplications by constants, which is very costly in terms of area and power consumption [21]. To reduce the computational complexity and increase the performance of the proposed architecture, these multiplications are replaced by shift and addition operations.…”
Section: The 2D-IDCT/IDST Hardware Design
Citation type: mentioning
confidence: 99%
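The shift-and-addition substitution described in this statement can be sketched in software as follows. This is an illustration of the general technique only, not the cited hardware architecture. The constants 64, 83, and 36 are the coefficients of the HEVC 4-point core transform; the function names are hypothetical, and Python integers stand in for fixed-width hardware signals.

```python
def shift_add_multiply(x: int, constant: int) -> int:
    """Multiply x by a non-negative constant using only shifts and additions.

    Each set bit of the constant contributes one shifted copy of x.
    """
    if constant < 0:
        raise ValueError("constant must be non-negative")
    result = 0
    shift = 0
    while constant:
        if constant & 1:
            result += x << shift
        constant >>= 1
        shift += 1
    return result


# Hand-expanded forms, as a hardware designer might write them.
def mul64(x: int) -> int:
    return x << 6                                   # 64 = 2^6


def mul83(x: int) -> int:
    return (x << 6) + (x << 4) + (x << 1) + x       # 83 = 64 + 16 + 2 + 1


def mul36(x: int) -> int:
    return (x << 5) + (x << 2)                      # 36 = 32 + 4
```

Each multiplier becomes a fixed set of shifts (free wiring in hardware) and a few adders, which is the source of the area and power savings the statement refers to.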
“…However, the HW blocks are created and customized using the Xilinx platform studio. From Figure 11, we can see that the communication between the ARM Cortex A9 processor and the IQ/IT hardware block is carried out using the AXI4-stream interface [21,26] which is specially designed for maximum bandwidth access to the on-chip memory and DDR memory of the PS. This mode of transfer supports unlimited data burst sizes and provides point-to-point streaming data without using any addresses.…”
Section: SW/HW Performance Evaluation
Citation type: mentioning
confidence: 99%