2018
DOI: 10.48550/arxiv.1811.09916
Preprint

Generating Realistic Training Images Based on Tonality-Alignment Generative Adversarial Networks for Hand Pose Estimation

Abstract: Hand pose estimation from a monocular RGB image is an important but challenging task. A main factor affecting its performance is the lack of a sufficiently large training dataset with accurate hand-keypoint annotations. In this work, we circumvent this problem by proposing an effective method for generating realistic hand poses, and show that state-of-the-art algorithms for hand pose estimation can be greatly improved by utilizing the generated hand poses as training data. Specifically, we first adopt an augme…

citations
Cited by 5 publications
(4 citation statements)
references
References 43 publications
“…InterHand2.6M is captured in a multi-camera studio consisting of 80-140 cameras capturing at 30-90 frames-per-second (fps), and 350-450 directional LED point lights directed at the hand to promote uniform illumination. The cameras captured at image resolution 4096 × 2668.…”
Section: Data Capture (mentioning)
confidence: 99%
“…RGB cameras are much more widely used than depth sensors. Estimating 3D hand poses merely from monocular RGB images are more practical and active in the literature [5,10,19,29,37,41,49,24,23,11]. The pioneering work by Zimmermann and Brox [49] utilizes convolutional neural networks (CNN) to extract image feature, and feed camera parameters with these features to a 3D lift network where depth information is then estimated.…”
Section: RGB Based 3D Hand Pose Estimation (mentioning)
confidence: 99%
“…Since depth images contain surface geometry information of hands, they are widely used for hand pose estimation in the literature [40,44,11,41,14,16,27,8,9]. Most existing work adopts regression to fit the parameters of a deformed hand model [30,22,24,40].…”
Section: D Hand Pose Estimation From Depth Images (mentioning)
confidence: 99%