2011 23rd International Symposium on Computer Architecture and High Performance Computing
DOI: 10.1109/sbac-pad.2011.19

Applying CUDA Architecture to Accelerate Full Search Block Matching Algorithm for High Performance Motion Estimation in Video Encoding

Cited by 7 publications (4 citation statements)
References 6 publications

“…Therefore, most of the parallel ME research work is on many-core concentrates on the full-search method, which is inherently highly parallel. In Chen and Hang [15]; Cheng et al [16]; Lee and Oh [17]; Monteiro et al [18], the parallel full-search method is implemented on the GPU platform, and about 10-100x speed-ups are, respectively, obtained compared with the serial full-search method on single core of a CPU. Although the speed-up of the full-search method is high on GPU platform, its performance advantage is not obvious compared with serial fast search method in HEVC or H.264/AVC on single core of a CPU.…”
Section: Related Work (mentioning)
confidence: 99%
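
The statement above calls the full-search method inherently highly parallel. A minimal sketch of why, assuming a sum-of-absolute-differences (SAD) cost and frames stored as lists of rows: every candidate displacement is scored independently of the others, so each one could be handed to a separate GPU thread. The function names, the SAD metric, and the toy frames are illustrative assumptions, not the cited paper's implementation.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by (dx, dy). Hypothetical helper for illustration."""
    return sum(
        abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, bx, by, n, search_range):
    """Test every displacement within +/- search_range and return
    (best_dx, best_dy, best_sad), or None if no candidate fits."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + n <= width
                    and 0 <= by + dy and by + dy + n <= height):
                continue
            # No candidate depends on another candidate's result; only
            # the final minimum needs a reduction across candidates.
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Toy frames: `cur` is `ref` shifted by 2 pixels in x and 1 pixel in y.
ref = [[x + 10 * y for x in range(8)] for y in range(8)]
cur = [[ref[y + 1][x + 2] if y + 1 < 8 and x + 2 < 8 else 0
        for x in range(8)] for y in range(8)]

result = full_search(cur, ref, bx=2, by=2, n=2, search_range=2)
print(result)  # (2, 1, 0)
```

The two nested loops visit (2 * search_range + 1)^2 candidates per block, which is the cost that the serial fast search methods mentioned in the statement avoid by testing only a subset of candidates.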
“…Initial efforts for the parallelization of the block matching algorithms in old parallel processing platforms have been presented in [1,11,12,14,15,16]. Specifically, as far as the HS algorithm is concerned, very few research works have been published, referring to systolic arrays [5,11,12] and not modern parallelization frameworks.…”
Section: Related Work (mentioning)
confidence: 99%
“…They have used either OpenMP or OpenMPI or GPU [16], or custom architectures [15] or special FPGA architectures [14]. None of them have ever tried to parallelize a block matching algorithm on an existing high performance embedded system, like a smart mobile phone.…”
Section: Introduction (mentioning)
confidence: 99%
“…There are many methods based on various approaches including gray based [3], frequency based [4] or feature based [5] methods. Also, motion estimation can be computed in 2D [6] [7], which is suitable for long monitoring distances in outdoor conditions or 3D [8], which are suitable for low focal distances, where obvious changes in parallaxes inducted by 3D viewpoint translations occurs.…”
Section: Introduction (mentioning)
confidence: 99%