2018
DOI: 10.1007/978-3-030-01252-6_8

Hand Pose Estimation via Latent 2.5D Heatmap Regression

Abstract: Estimating the 3D pose of a hand is an essential part of human-computer interaction. Estimating 3D pose using depth or multiview sensors has become easier with recent advances in computer vision; however, regressing pose from a single RGB image is much less straightforward. The main difficulty arises from the fact that 3D pose requires some form of depth estimates, which are ambiguous given only an RGB image. In this paper we propose a new method for 3D hand pose estimation from a monocular image through a nov…

Cited by 254 publications (345 citation statements)
References: 68 publications
“…We trained a state-of-the-art network architecture [14] that takes as input an RGB image and predicts 3D keypoints on the training split of each of the datasets and report its performance on the evaluation split of all other datasets. For each dataset, we either use the standard training/evaluation split reported by the authors or create an 80%/20% split dataset num.…”
Section: Evaluation Setup (mentioning)
confidence: 99%
“…Due to scale ambiguity, the problem to estimate real world 3D keypoint coordinates in a camera centered coordinate frame is ill-posed. Hence, we adopt the problem formulation of [14] to estimate coordinates in a root relative and scale normalized fashion:…”
Section: Evaluation Setup (mentioning)
confidence: 99%
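
The citing statement above breaks off before its formula, so the exact definition is not recoverable here. As a rough illustration only, a root-relative and scale-normalized representation can be sketched as follows. The function name `normalize_keypoints`, the choice of joint 0 as root, and the use of the bone between joints 0 and 1 as the scale reference are all assumptions made for this example, not details taken from the cited papers.

```python
import math


def normalize_keypoints(keypoints, root=0, bone=(0, 1)):
    """Return root-relative, scale-normalized 3D keypoints.

    keypoints: list of (x, y, z) tuples in a camera-centered frame.
    root:      index of the root joint (assumed for this sketch).
    bone:      pair of joint indices whose distance sets the scale
               (assumed for this sketch).
    """
    scale = math.dist(keypoints[bone[0]], keypoints[bone[1]])
    if scale == 0:
        raise ValueError("reference bone has zero length")
    rx, ry, rz = keypoints[root]
    # Subtract the root joint, then divide by the reference bone length.
    return [((x - rx) / scale, (y - ry) / scale, (z - rz) / scale)
            for x, y, z in keypoints]


# Hypothetical three-joint example, coordinates in meters.
pts = [(0.0, 0.0, 0.5), (0.0, 0.1, 0.5), (0.05, 0.2, 0.45)]
norm = normalize_keypoints(pts)

# Uniformly scaling the input leaves the normalized result unchanged,
# which is why this representation sidesteps the scale ambiguity.
scaled = [(2 * x, 2 * y, 2 * z) for x, y, z in pts]
norm_scaled = normalize_keypoints(scaled)
```

In this form the root joint maps to the origin and the reference bone has unit length, so two inputs that differ only by a global scale yield the same target.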
“…Deep neural net architectures enable direct body location with pose prediction, which is an advantage compared to traditional model-based methods that require good initialization [4,26]. Several methods predict 3D pose directly given monocular data [52,41,50,38,32,16,19,47]. On the other hand, many approaches lift 2D human poses [8,5], used as an intermediate representation, and learn a model for 2D-3D pose space mapping [61,63,62,34,9].…”
Section: Related Work (mentioning)
confidence: 99%
“…Differently from our previous work in [34], we show that a volumetric representation is not required for 3D prediction. Similarly to methods on hand pose estimation [26] and on 3D human pose estimation [39], we predict 2D depth maps which encode the relative depth of each body joint.…”
Section: 3D Pose Estimation (mentioning)
confidence: 99%
“…Differently from our previous work [34], where volumetric heat maps were required to estimate the third dimension of body joints, here we use a similar approach to [26], where specialized depth maps d are used to encode the depth information. Similarly to the probability maps decomposition from section 3.2.1, here we define d j as a depth map for the jth body joint.…”
Section: Depth Estimation (mentioning)
confidence: 99%