2018
DOI: 10.1007/978-3-030-01249-6_48

Deep Autoencoder for Combined Human Pose Estimation and Body Model Upscaling

Abstract: We present a method for simultaneously estimating 3D human pose and body shape from a sparse set of wide-baseline camera views. We train a symmetric convolutional autoencoder with a dual loss that enforces learning of a latent representation that encodes skeletal joint positions, and at the same time learns a deep representation of volumetric body shape. We harness the latter to up-scale input volumetric data by a factor of 4×, whilst recovering a 3D estimate of joint positions with equal or greater accuracy t…
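A minimal, hypothetical sketch of the architecture the abstract describes may help: a symmetric 3D convolutional autoencoder whose latent code is supervised to regress skeletal joint positions while the decoder reconstructs the body volume at 4× the input resolution. This is not the authors' implementation; the layer widths, the 32³ input resolution, the joint count and the loss weighting are all illustrative assumptions.

```python
# Hypothetical sketch of a symmetric 3D conv autoencoder with a dual loss:
# the latent code feeds a joint-position head, and the decoder overshoots
# the input resolution to produce a 4x upscaled volume (32^3 -> 128^3).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualLossAutoencoder(nn.Module):
    def __init__(self, num_joints=17, latent_dim=512):  # assumed sizes
        super().__init__()
        self.num_joints = num_joints
        # Encoder: strided 3D convolutions downsample the occupancy volume.
        self.encoder = nn.Sequential(
            nn.Conv3d(1, 32, 3, stride=2, padding=1), nn.ReLU(),   # 32 -> 16
            nn.Conv3d(32, 64, 3, stride=2, padding=1), nn.ReLU(),  # 16 -> 8
            nn.Flatten(),
            nn.Linear(64 * 8 ** 3, latent_dim), nn.ReLU(),
        )
        # Head 1: regress 3D joint positions directly from the latent code.
        self.joint_head = nn.Linear(latent_dim, num_joints * 3)
        # Head 2: symmetric decoder yielding the 4x upscaled volume.
        self.fc_dec = nn.Linear(latent_dim, 64 * 8 ** 3)
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(64, 32, 4, stride=2, padding=1), nn.ReLU(),  # 8 -> 16
            nn.ConvTranspose3d(32, 16, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 32
            nn.ConvTranspose3d(16, 8, 4, stride=2, padding=1), nn.ReLU(),   # 32 -> 64
            nn.ConvTranspose3d(8, 1, 4, stride=2, padding=1),               # 64 -> 128
        )

    def forward(self, volume):                    # volume: (B, 1, 32, 32, 32)
        z = self.encoder(volume)
        joints = self.joint_head(z).view(-1, self.num_joints, 3)
        vol = self.decoder(self.fc_dec(z).view(-1, 64, 8, 8, 8))
        return joints, vol                        # vol: (B, 1, 128, 128, 128)

def dual_loss(pred_joints, gt_joints, pred_vol, gt_vol, w_vol=1.0):
    """Dual loss: joint-position error plus upscaled-volume reconstruction."""
    return F.mse_loss(pred_joints, gt_joints) + w_vol * F.mse_loss(pred_vol, gt_vol)
```

The dual loss is what ties the two tasks together: because a single latent code must serve both heads, it is forced to encode pose and a deep representation of body shape at the same time.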

Cited by 54 publications (39 citation statements) · References 48 publications (61 reference statements)
“…3.5, Tri CPM LSTM. As one can see from Table 2, our proposed approach outperforms all compared methods at the time of publication (the newer works of Martinez et al (2017) and Trumble et al (2018) indicate the speed of improvement in the field of 3D pose estimation), despite excluding the fusion with the kinematic-based IMU, with the mean error reduced by 15% compared with Tome et al (2017), the Tri CPM LSTM approach and our previous method. Compared with the state-of-the-art results of Lin, Lin and Cheng (2017), many activities show a similar error of around 5 or 6 cm.…”
Section: Human 3.6M
confidence: 86%
“…Given these restrictions, we propose a new dataset to address these shortcomings, TotalCapture.¹ It contains a large amount of MVV, and synchronised IMU and Vicon labelling for ground truth. [Table 2: A comparison of our approach to other works on the Human 3.6M dataset; multiview indicates whether the approach uses multiple camera views, including the works of Martinez et al (2017) and Trumble et al (2018).] It was captured indoors in a volume measuring roughly 8 × 4 m with 8 calibrated HD video cameras at 60 Hz.…”
Section: TotalCapture
confidence: 99%
“…Most approaches take a two-stage approach: first obtaining a single-view 3D reconstruction and then post-processing the result to be smooth via solving a constrained optimization problem [66,58,46,47,27,38,43]. Recent methods obtain accurate shapes and textures of clothing by pre-capturing the actors and making use of silhouettes [50,60,24,4]. While these approaches obtain far more accurate shape, reliance on the pre-scan and silhouettes restricts them to videos captured in interactive, controlled environments.…”
Section: Related Work
confidence: 99%
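The smoothness post-processing the quoted passage refers to can be sketched in a few lines. For simplicity, the sketch below solves a regularised (unconstrained) least-squares problem, minimising ||x − y||² + λ·||Dx||² with D a second-difference operator, rather than a fully constrained optimisation; the trajectory, λ and operator are illustrative choices, not taken from any of the cited works.

```python
# Smooth a noisy per-frame joint trajectory by penalising second differences.
# Closed form: solve (I + lam * D^T D) x = y.
import numpy as np

def smooth_trajectory(y, lam=20.0):
    """y: (T, 3) noisy 3D positions of one joint over T frames."""
    T = len(y)
    D = np.zeros((T - 2, T))
    for t in range(T - 2):                      # second finite differences
        D[t, t:t + 3] = [1.0, -2.0, 1.0]
    A = np.eye(T) + lam * (D.T @ D)             # normal equations
    return np.linalg.solve(A, y)

# Synthetic demo: a circular arc corrupted by per-frame jitter.
t = np.linspace(0.0, 1.0, 100)
clean = np.stack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t), t], axis=1)
noisy = clean + 0.05 * np.random.default_rng(1).standard_normal(clean.shape)
smoothed = smooth_trajectory(noisy)
print("RMS error before:", np.sqrt(((noisy - clean) ** 2).mean()))
print("RMS error after :", np.sqrt(((smoothed - clean) ** 2).mean()))
```

Increasing λ trades per-frame fidelity for temporal smoothness, which is the same trade-off the two-stage pipelines above tune in their post-processing step.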
“…Studies have shown that human motion includes a lot of redundant information and can be approximately represented in fewer dimensions than its original degrees of freedom [6,17,18], which inspired the study of sparse-sensor motion capture. Some researchers developed motion-capture technologies combining sparse inertial sensors with video input [7,19] or optical reflective markers [20], or using only sparse optical markers [21].…”
Section: Related Work
confidence: 99%
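The low-dimensionality claim in this last quoted passage can be illustrated with a principal-component reconstruction. The sketch below synthesises poses lying near a low-dimensional subspace of a 69-DoF pose space and reconstructs them from a handful of components; the data, the 69-DoF layout and the choice of 10 components are assumptions made for the demo, not values from the cited studies.

```python
# Demo: poses that occupy a low-dimensional subspace of pose space can be
# reconstructed accurately from a few principal components.
import numpy as np

rng = np.random.default_rng(0)
frames, dof, intrinsic = 1000, 69, 10           # e.g. 23 joints x 3 angles
poses = rng.standard_normal((frames, intrinsic)) @ rng.standard_normal((intrinsic, dof))
poses += 0.01 * rng.standard_normal((frames, dof))    # measurement noise

mean = poses.mean(axis=0)
centered = poses - mean
_, _, vt = np.linalg.svd(centered, full_matrices=False)  # principal axes
k = 10                                          # keep far fewer dims than dof
recon = centered @ vt[:k].T @ vt[:k] + mean
rms = np.sqrt(((poses - recon) ** 2).mean())
print(f"RMS error reconstructing {dof}-DoF poses from {k} dims: {rms:.4f}")
```

The small residual is what makes sparse-sensor capture feasible: a handful of measurements can pin down a pose whose effective dimensionality is far below its nominal degrees of freedom.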