2014
DOI: 10.1145/2629500

Real-Time Continuous Pose Recovery of Human Hands Using Convolutional Networks

Abstract: We present a novel method for real-time continuous pose recovery of markerless complex articulable objects from a single depth image. Our method consists of the following stages: a randomized decision forest classifier for image segmentation, a robust method for labeled dataset generation, a convolutional network for dense feature extraction, and finally an inverse kinematics stage for stable real-time pose recovery. As one possible application of this pipeline, we show state-of-the-art results for real-time p…

Cited by 742 publications (950 citation statements)
References 21 publications
“…This requires extensive rendering of an explicit hand model in various poses. Tompson et al [29] use an (offline) PSO based approach to find the ground truth for the NYU dataset [29]. Since PSO depends highly on a good initialization, Gian et al [18] increase its robustness by combining it with ICP, while Taylor et al [26] suggest minimizing a truncated L1 error norm between the synthesized and real depth image while also rendering a more realistic-looking mesh through linear blend skinning (LBS) [11].…”
Section: Related Work (mentioning)
confidence: 99%
“…Recent approaches in 3D hand pose estimation from a single depth image are predominantly based on convolutional neural network architectures [29,15,32,6,22,30], typically requiring labelled data for training. While the accuracy of such methods has been disputed on a limited number of available datasets that are applicable to learning-based approaches, such as [29,2,23], the main problem seems to shift to a large degree towards scarcity of data labelling (e.g.…”
Section: Introduction (mentioning)
confidence: 99%