2016 Fourth International Conference on 3D Vision (3DV) 2016
DOI: 10.1109/3dv.2016.58

Synthesizing Training Images for Boosting Human 3D Pose Estimation

Abstract: Human 3D pose estimation from a single image is a challenging task with numerous applications. Convolutional Neural Networks (CNNs) have recently achieved superior performance on the task of 2D pose estimation from a single image, by training on images with 2D annotations collected by crowd sourcing. This suggests that similar success could be achieved for direct estimation of 3D poses. However, 3D poses are much harder to annotate, and the lack of suitable annotated training images hinders attempts towards en…

Cited by 255 publications (219 citation statements)
References 52 publications
“…Even the most recent annotated real 3D pose data sets, or combined real/synthetic data sets [Chen et al 2016; Ionescu et al 2014b; Mehta et al 2016] are a subset of real world human pose, shape, appearance and background distributions. Recent top performing methods explicitly address this data sparsity by training similarly deep networks, but with architectural changes enabling improved intermediate training supervision [Mehta et al 2016].…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Recent works have shown the success of deep network architectures for the problem of retrieving 3D features such as kinematic joints [4,33] or surface characterizations [43] from single images, with extremely encouraging results. Such successes, sometimes achieved with simple, standard network architectures [30], have naturally motivated …”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Therefore, researchers have also investigated the use of computer graphics to synthesize large datasets for applications requiring a different set of annotations, e.g. human 3D pose estimation [12], [13] or 3D character manipulation [14]. The key element with synthetic images is that annotations come at no cost since these are generated during rendering.…”
Section: A Motivations
Citation type: mentioning
confidence: 99%