An efficient parallelization technique for x264 encoder on heterogeneous platforms consisting of CPUs and GPUs

Ko, Youngsub; Yi, Youngmin; Ha, Soonhoi

doi:10.1007/s11554-012-0317-y

Cited by 8 publications

(5 citation statements)

References 12 publications


“…In order to test the applicability of our method for practical use, the second implementation was based on the x264 codec, named LBRL-x264, compared to the unmodified model of x264. x264 is known as the fastest CPU implementation of video compression [42]. It is the most commonly used codec in the practice, including applications on UAVs and video satellite platforms.…”

Section: Experiments With UAV Video Clips (mentioning)

confidence: 99%

Towards Real-Time Service from Remote Sensing: Compression of Earth Observatory Video Data via Long-Term Background Referencing

Xiao, Zhu, et al. 2018, Remote Sensing


City surveillance enables many innovative applications of smart cities. However, the real-time utilization of remotely sensed surveillance data via unmanned aerial vehicles (UAVs) or video satellites is hindered by the considerable gap between the high data collection rate and the limited transmission bandwidth. High efficiency compression of the data is in high demand. Long-term background redundancy (LBR) (in contrast to local spatial/temporal redundancies in a single video clip) is a new form of redundancy common in Earth observatory video data (EOVD). LBR is induced by the repetition of static landscapes across multiple video clips and becomes significant as the number of video clips shot of the same area increases. Eliminating LBR improves EOVD coding efficiency considerably. First, this study proposes eliminating LBR by creating a long-term background referencing library (LBRL) containing high-definition geographically registered images of an entire area. Then, it analyzes the factors affecting the variations in the image representations of the background. Next, it proposes a method of generating references for encoding current video and develops the encoding and decoding framework for EOVD compression. Experimental results show that encoding UAV video clips with the proposed method saved an average of more than 54% bits using references generated under the same conditions. Bitrate savings reached 25-35% when applied to satellite video data with arbitrarily collected reference images. Applying the proposed coding method to EOVD will facilitate remote surveillance, which can foster the development of online smart city applications.


“…Moreover, the approaches that distribute the load of a single module across CPU and GPU devices, usually perform an exhaustive search over all possible distributions and/or rely on simplified module/device performance models. In detail, the load distribution is found i) by a large set of experiments for optimal "sub-frame" pipelining in [6]; ii) with constant compute-only performance parametrization in a single-GPU platform [4]; and iii) by intersecting the fitted full performance curves (experimentally obtained before module execution) for each device in the system [9]. Moreover, several works apply simplistic equidistant data partitioning of CF/RFs in homogeneous multi-GPU environments [10] without considering CPUs for computing.…”

Section: Background and Related Work (mentioning)

confidence: 99%

Adaptive Scheduling Framework for Real-Time Video Encoding on Heterogeneous Systems

Ilić, Momcilovic, Roma, et al. 2016, IEEE Trans. Circuits Syst. Video Technol.


In order to challenge real-time encoding of high-definition video sequences on heterogeneous desktop systems, a collaborative CPU+GPU framework for inter-loop video encoding is proposed herein. The proposed framework considers the overall complexity of the collaborative inter-loop encoding as a unified optimization problem. Several functional blocks are integrated for simultaneous execution control, automatic data access management, performance characterization, and adaptive scheduling and load balancing. These blocks aim at fully exploiting the performance of heterogeneous devices, asymmetric bandwidth of communication links and several levels of concurrency between computation and communication. To support a wide range of CPU and GPU architectures, a specific encoding library is developed with highly optimized algorithms for all inter-loop modules. The experimental results show that the proposed framework allows achieving a real-time encoding of full high-definition sequences in several CPU+GPU systems. It also delivers performance improvements of up to 61.2% over the state-of-the-art solution, while outperforming individual GPU and quad-core CPU executions by more than 2 and 5 times, respectively.
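The load-distribution idea raised in the citation statement above (splitting one module's work between CPU and GPU using per-device performance models) can be illustrated with a minimal sketch. This is not the scheduling algorithm of the cited framework: the linear timing models, their coefficients, the function names, and the figure of 68 macroblock rows (a 1080p frame padded to 1088 lines) are all assumptions chosen for illustration.

```python
def balance_split(total_rows, cpu_time, gpu_time):
    """Pick how many macroblock rows go to the CPU (the rest go to the GPU).

    Both devices run concurrently, so the frame time is the slower of the two.
    Scanning every split finds the point where the two curves cross, which is
    where the slower device's time is smallest.
    """
    best_rows, best_time = 0, float("inf")
    for cpu_rows in range(total_rows + 1):
        frame_time = max(cpu_time(cpu_rows), gpu_time(total_rows - cpu_rows))
        if frame_time < best_time:
            best_rows, best_time = cpu_rows, frame_time
    return best_rows, best_time


# Hypothetical fitted models (milliseconds); real curves would be measured.
def cpu_time(rows):
    return 0.5 * rows


def gpu_time(rows):
    # Assumed fixed launch/transfer overhead plus a per-row cost.
    return 0.0 if rows == 0 else 1.0 + 0.125 * rows


if __name__ == "__main__":
    rows, ms = balance_split(68, cpu_time, gpu_time)
    print(f"CPU rows: {rows}, GPU rows: {68 - rows}, frame time: {ms} ms")
```

An exhaustive scan is affordable here because a frame has only a few dozen rows; with non-linear measured curves the same loop still works, since it only evaluates the models.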


“…This scheme enabled concurrent deblocking filtering with limited synchronization effort, independently of slice configuration. Several works have focused on the use of GPU to accelerate the ME process for H.264/AVC [9][10][11][12][13][14]. Most GPU-based ME algorithms employ the full-search method because it is suitable for the SIMD (single instruction and multiple data) architecture of GPU.…”

Section: Introduction (mentioning)

confidence: 99%
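The full-search method named in the statement above can be sketched as follows. This is a sequential, pure-Python illustration of block matching with a sum-of-absolute-differences (SAD) cost, not the GPU implementation of any cited work. The function names, the `center` parameter (a search center such as a co-located block's motion vector), and the frame layout as lists of rows are assumptions. Every candidate displacement is evaluated independently with the same arithmetic, which is why the method maps well onto SIMD hardware.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in `cur` at (bx, by)
    and the block in `ref` displaced by the candidate vector (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, size, search_range, center=(0, 0)):
    """Evaluate every displacement within +/- search_range of `center`
    and return (best_motion_vector, best_sad)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(center[1] - search_range, center[1] + search_range + 1):
        for dx in range(center[0] - search_range, center[0] + search_range + 1):
            x0, y0 = bx + dx, by + dy
            # Skip candidates whose reference block leaves the frame.
            if x0 < 0 or y0 < 0 or x0 + size > width or y0 + size > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

On a GPU, each iteration of the two outer loops would typically be assigned to its own thread, with a parallel reduction replacing the running minimum.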

“…They use the motion vector (MV) of the co-located macroblock as the SCP for the current macroblock. Ko et al. developed a fast x264 encoder with a GPU-based ME algorithm on CPU plus GPU platforms [13]. A pipelining technique called subframe ME processing is introduced to effectively hide the communication overhead between CPU and GPU.…”

Section: Introduction (mentioning)

confidence: 99%

Fast motion estimation for HEVC on graphics processing unit (GPU)

Lee, Sim, Cho, et al. 2015, J Real-Time Image Proc


The recent video compression standard, HEVC (high efficiency video coding), will most likely be used in various applications in the near future. However, the encoding process is far too slow for real-time applications. At the same time, computing capabilities of GPUs (graphics processing units) have become more powerful in these days. In this paper, we have proposed a GPU-based parallel motion estimation (ME) algorithm to enhance the performance of an HEVC encoder. A frame is partitioned into two subframes for pipelined execution to improve GPU utilization. The flow chart is redetermined to solve data hazards in the pipelined execution. Two new methods are introduced in the proposed ME: decision of a representative search center position (RSCP) and warp-based concurrent parallel reduction (WCPR). A RSCP employs motion vectors of a co-located CTU in a previously encoded frame to solve a dependency problem in parallel computation with negligible coding loss. WCPR concurrently executes several parallel reduction operations, which increases the thread utilization from 20 to 89 % without any thread synchronization. The proposed encoder can make the portion of ME in the encoder negligible with 2.2 % bitrate increase against the HEVC test model (HM) encoder. In terms of ME, the proposed ME is 130.7 times faster than that of the HM encoder.
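Why splitting a frame into subframes helps can be seen from a simple two-stage pipeline model. The sketch below is a generic timing model under stated assumptions, not the scheduling of the cited encoders: the stage names (a CPU-to-GPU transfer followed by a GPU compute step), the millisecond figures, and the function name are hypothetical. Each subframe's compute step starts once its own transfer has finished and the previous subframe's compute step is done, so transfers overlap with computation.

```python
def pipeline_makespan(transfer_ms, compute_ms):
    """Total time for items passing through transfer -> compute stages.

    Transfers run back to back; a compute step waits for its own transfer
    and for the previous compute step to finish.
    """
    transfer_done = 0.0
    compute_done = 0.0
    for t, c in zip(transfer_ms, compute_ms):
        transfer_done += t
        compute_done = max(compute_done, transfer_done) + c
    return compute_done


if __name__ == "__main__":
    # Assumed costs for one whole frame: 4 ms transfer, 6 ms compute.
    whole = pipeline_makespan([4.0], [6.0])
    halves = pipeline_makespan([2.0, 2.0], [3.0, 3.0])
    quarters = pipeline_makespan([1.0] * 4, [1.5] * 4)
    print(whole, halves, quarters)
```

The model ignores per-subframe launch overhead, which in practice limits how finely a frame can be split before the gains disappear.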
