2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016
DOI: 10.1109/cvpr.2016.537

Sparseness Meets Deepness: 3D Human Pose Estimation from Monocular Video

Abstract: This paper addresses the challenge of 3D full-body human pose estimation from a monocular image sequence. Here, two cases are considered: (i) the image locations of the human joints are provided and (ii) the image locations of joints are unknown. In the former case, a novel approach is introduced that integrates a sparsity-driven 3D geometric prior and temporal smoothness. In the latter case, the former case is extended by treating the image locations of the joints as latent variables to take into account cons…

Cited by 377 publications (408 citation statements)
References 57 publications
“…Recent works have made significant advances in the frontier of skeleton-based 3D human pose estimation from single images, with many approaches achieving impressive results [21,23,29,33,35,45]. Although this line of work has boosted the interest for 3D human pose estimation, here we will focus our review on model-based pose estimation.…”
Section: Related Work (mentioning)
confidence: 99%
“…Viewpoint parameters R view. Besides the fully supervised methods [25,26], several works have explored multi-view supervision [20,29,31], ordinal depth supervision [28], unpaired 2D-3D data [30,36,41,15] or videos [17] to alleviate the need for full 2D-3D annotations. While these auxiliary sources of supervision allow for compelling 3D predictions, in this work we use only inexpensive 2D keypoint labels.…”
Section: Factorization Network (mentioning)
confidence: 99%
“…Learning-based discriminative methods, in particular deep learning methods [Lifshitz et al 2016;Newell et al 2016;Tompson et al 2014], represent the current state of the art in 2D pose estimation, with some of these methods demonstrating real-time performance [Cao et al 2016;Wei et al 2016]. Monocular RGB estimation of the 3D skeletal pose is a much harder challenge tackled by relatively fewer methods [Bogo et al 2016;Tekin et al 2016b,c;Zhou et al , 2015b]. Unfortunately, these methods are typically offline, and they often reconstruct 3D joint positions individually per image, which are temporally unstable, and do not enforce constant bone lengths.…”
Section: Introduction (mentioning)
confidence: 99%