2020
DOI: 10.48550/arxiv.2004.11822
Preprint
3D Human Pose Estimation using Spatio-Temporal Networks with Explicit Occlusion Training

Cited by 5 publications (9 citation statements)
References 34 publications
“…Additionally, when using GTs, LiftFormer is still better than other end-to-end models [8,7,46] with 42.9, 40.1 and 39.9 mm MPJPE, respectively, which not only leverage temporal data, but also features extracted from the original RGB images themselves, optical flow or occlusion enhanced heatmaps. Our model also outperforms other SMPL-based approaches like SPIN [20] or ENAS [33], with 41.1mm and 42.4mm MPJPE, respectively, and multi-view methods, like DeepFuse [16] with 37.5mm MPJPE.…”
Section: Comparison With State-of-the-art
Confidence: 96%
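The error figures quoted above are MPJPE (Mean Per Joint Position Error) values, the standard metric in this comparison. A minimal sketch of how it is typically computed, assuming predicted and ground-truth joints are given as (num_joints, 3) arrays in millimetres and already aligned at the root joint (the array shapes and the 17-joint toy skeleton here are illustrative, not taken from the paper):

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Per Joint Position Error: the average Euclidean
    distance between predicted and ground-truth 3D joints.
    pred, gt: arrays of shape (num_joints, 3), in mm."""
    return np.linalg.norm(pred - gt, axis=-1).mean()

# Toy example: every joint predicted 10 mm off along the x axis.
gt = np.zeros((17, 3))
pred = gt.copy()
pred[:, 0] += 10.0
print(mpjpe(pred, gt))  # → 10.0
```

Lower is better, which is why LiftFormer's numbers are reported as improvements over the 37.5–42.9 mm range of the compared methods.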
“…Cheng et al [7] improves the state-of-the-art by using a discriminator to assess if the generated poses are valid. Specifically, they use the Kinematic Chain Space (KCS) model, defined in [42], and expand it temporally (TKCS).…”
Section: Related Work
Confidence: 99%
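As a rough illustration of the Kinematic Chain Space idea referenced above: joints are mapped to bone vectors, and the Gram matrix of the bones encodes bone lengths (diagonal) and inter-bone angles (off-diagonal), which a discriminator can assess for plausibility. This sketch uses a hypothetical 4-joint chain, not the skeleton or the temporal extension (TKCS) from the cited papers:

```python
import numpy as np

# Hypothetical 4-joint chain: root -> spine -> neck -> head.
# P holds the 3D joint positions as columns, shape (3, num_joints).
P = np.array([[0.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 2.0, 2.5],
              [0.0, 0.0, 0.0, 0.5]])

# C maps joints to bones: each column produces (child - parent).
C = np.array([[-1.0,  0.0,  0.0],
              [ 1.0, -1.0,  0.0],
              [ 0.0,  1.0, -1.0],
              [ 0.0,  0.0,  1.0]])

B = P @ C        # bone vectors, shape (3, num_bones)
Psi = B.T @ B    # KCS matrix: diagonal = squared bone lengths,
                 # off-diagonal = dot products between bones (angles)
print(np.diag(Psi))  # squared lengths of the three bones
```

Because Psi is invariant to the global position of the skeleton, it gives the discriminator a compact view of limb lengths and joint angles rather than raw coordinates.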
“…To make this approach applicable to personalized gesture-based retrieval systems, it can be extended to monocular video captured by accessible devices such as a mobile phone camera. This approach would be feasible due to recent progress in the area of 3D human pose estimation in predicting the body joint coordinates from a monocular video [37][38][39]. This would then allow future recommendation systems to take embodied processes into account, resulting in better and more responsive personalized experiences.…”
Section: Root Torso
Confidence: 99%