2013
DOI: 10.1007/s11554-012-0317-y

An efficient parallelization technique for x264 encoder on heterogeneous platforms consisting of CPUs and GPUs

citations
Cited by 8 publications
(5 citation statements)
references
References 12 publications
“…In order to test the applicability of our method for practical use, the second implementation was based on the x264 codec, named LBRL-x264, compared to the unmodified model of x264. x264 is known as the fastest CPU implementation of video compression [42]. It is the most commonly used codec in the practice, including applications on UAVs and video satellite platforms.…”
Section: Experiments With UAV Video Clips (mentioning)
confidence: 99%
“…Moreover, the approaches that distribute the load of a single module across CPU and GPU devices, usually perform an exhaustive search over all possible distributions and/or rely on simplified module/device performance models. In detail, the load distribution is found i) by a large set of experiments for optimal "sub-frame" pipelining in [6]; ii) with constant compute-only performance parametrization in a single-GPU platform [4]; and iii) by intersecting the fitted full performance curves (experimentally obtained before module execution) for each device in the system [9]. Moreover, several works apply simplistic equidistant data partitioning of CF/RFs in homogeneous multi-GPU environments [10] without considering CPUs for computing.…”
Section: Background and Related Work (mentioning)
confidence: 99%
“…This scheme enabled concurrent deblocking filtering with limited synchronization effort, independently of slice configuration. Several works have focused on the use of GPU to accelerate the ME process for H.264/AVC [9][10][11][12][13][14]. Most GPU-based ME algorithms employ the full-search method because it is suitable for the SIMD (single instruction and multiple data) architecture of GPU.…”
Section: Introduction (mentioning)
confidence: 99%
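
The statement above notes that GPU-based ME (motion estimation) usually relies on full search. As a rough illustration only, the following pure-Python sketch shows what full-search block matching computes for a single block, using the sum of absolute differences (SAD) as the matching cost. The function names `sad` and `full_search`, the frame layout (lists of rows), and the choice of SAD are assumptions made for this example; they are not taken from the cited papers or from x264.

```python
def sad(cur_block, ref_frame, top, left):
    """Sum of absolute differences between cur_block and the
    same-sized region of ref_frame whose top-left corner is (top, left)."""
    return sum(
        abs(cur_block[r][c] - ref_frame[top + r][left + c])
        for r in range(len(cur_block))
        for c in range(len(cur_block[0]))
    )


def full_search(cur_block, ref_frame, top, left, search_range):
    """Exhaustively test every displacement (dy, dx) within
    +/- search_range around (top, left) and return the tuple
    (dy, dx, cost) with the lowest SAD. Candidates that fall
    outside the reference frame are skipped."""
    n, m = len(cur_block), len(cur_block[0])
    h, w = len(ref_frame), len(ref_frame[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + n > h or x + m > w:
                continue  # candidate block lies partly outside the frame
            cost = sad(cur_block, ref_frame, y, x)
            if best is None or cost < best[2]:
                best = (dy, dx, cost)
    return best


# Hypothetical 16x16 reference frame in which every pixel value is unique.
ref = [[r * 16 + c for c in range(16)] for r in range(16)]
# A 4x4 block whose content sits at row 6, column 7 of the reference frame.
block = [row[7:11] for row in ref[6:10]]
# The block is located at (4, 4) in the current frame, so the best
# displacement is (+2, +3) with zero residual cost.
print(full_search(block, ref, 4, 4, 3))
```

Every candidate displacement is evaluated independently of the others, which is why this search maps naturally onto a SIMD-style architecture: in a GPU version, each candidate (or each block) could be assigned to its own thread.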
“…They use the motion vector (MV) of the co-located macroblock as the SCP for the current macroblock. Ko et al. developed a fast x264 encoder with a GPU-based ME algorithm on CPU plus GPU platforms [13]. A pipelining technique called subframe ME processing is introduced to effectively hide the communication overhead between CPU and GPU.…”
Section: Introduction (mentioning)
confidence: 99%