2020
DOI: 10.48550/arxiv.2012.11806
Preprint
Graph and Temporal Convolutional Networks for 3D Multi-person Pose Estimation in Monocular Videos

Abstract: Despite recent progress, 3D multi-person pose estimation from monocular videos remains challenging due to the commonly encountered problem of missing information caused by occlusion, partially out-of-frame target persons, and inaccurate person detection. To tackle this problem, we propose a novel framework integrating graph convolutional networks (GCNs) and temporal convolutional networks (TCNs) to robustly estimate camera-centric multi-person 3D poses without requiring camera parameters. In particula…

Cited by 1 publication (2 citation statements)
References 48 publications
“…These blocks are effectively combined with temporal dependencies to address depth ambiguity and overcome self-occlusion. Likewise, Cheng et al [146] also present a method that combines GCNs and TCNs, but for 3D multi-person pose estimation in monocular videos. The framework introduces two types of GCNs: a human-joint GCN and a human-bone GCN.…”
Section: Methods Based on GCNs
confidence: 99%
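The human-joint and human-bone GCNs described in the statement above operate on skeleton graphs. As an illustration only (the joint chain, feature dimensions, and weights below are assumptions, not the paper's actual configuration), a single graph-convolution layer over a toy joint graph can be sketched as:

```python
import numpy as np

# Toy skeleton: a chain of 5 joints connected by 4 bones.
# These are illustrative choices, not the paper's skeleton definition.
NUM_JOINTS = 5
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4)]

def normalized_adjacency(num_nodes, edges):
    """Symmetrically normalized adjacency with self-loops:
    D^{-1/2} (A + I) D^{-1/2}, the standard GCN propagation matrix."""
    A = np.eye(num_nodes)
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
    return d_inv_sqrt @ A @ d_inv_sqrt

def gcn_layer(X, A_hat, W):
    """One graph-convolution layer: ReLU(A_hat @ X @ W)."""
    return np.maximum(A_hat @ X @ W, 0.0)

# X: per-joint 2D input features; W: (random stand-in for learned) weights.
rng = np.random.default_rng(0)
X = rng.standard_normal((NUM_JOINTS, 2))
W = rng.standard_normal((2, 3))
A_hat = normalized_adjacency(NUM_JOINTS, EDGES)
H = gcn_layer(X, A_hat, W)
print(H.shape)  # (5, 3): one 3-dim feature per joint
```

A human-bone GCN would apply the same operation to a graph whose nodes are bones rather than joints; only the adjacency changes.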
“…Similar to the above methods for depth estimation, HMORs (hierarchical multi-person ordinal relations) [145] employs an integrated top-down model to estimate human bounding boxes, depths, and root-relative 3D poses simultaneously. Its coarse-to-fine architecture, instead of relying on image features as the above methods do, hierarchically estimates multi-person ordinal relations of depths and angles, capturing body-part and joint-level semantics while maintaining global consistency to improve depth estimation. The framework proposed for 3D multi-person pose estimation in [146] combines GCNs and TCNs to estimate camera-centric poses without requiring camera parameters. Its GCNs estimate frame-wise 3D poses, while its TCNs enforce temporal and human-dynamics constraints across frames: a joint-TCN estimates person-centric 3D poses, and a root-TCN estimates camera-centric 3D poses.…”
Section: Top-down Approaches
confidence: 99%
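The joint-TCN and root-TCN mentioned above both apply temporal convolutions to per-frame pose features. A minimal sketch of the underlying operation (kernel size, weights, and feature dimensions are assumptions for illustration, not the paper's settings):

```python
import numpy as np

def temporal_conv(seq, kernel):
    """Valid 1D convolution along the time axis, applied independently
    to each feature channel — the basic building block of a TCN layer."""
    T, C = seq.shape
    k = kernel.shape[0]
    out = np.empty((T - k + 1, C))
    for t in range(T - k + 1):
        out[t] = (kernel[:, None] * seq[t:t + k]).sum(axis=0)
    return out

# 10 frames of a 3-channel feature (e.g. a root-joint trajectory),
# here a simple linear ramp so the effect of smoothing is visible.
frames = np.arange(10, dtype=float)[:, None].repeat(3, axis=1)
smooth = temporal_conv(frames, np.array([0.25, 0.5, 0.25]))
print(smooth.shape)  # (8, 3): 'valid' convolution shortens the sequence
```

Stacking such layers with increasing receptive fields lets a TCN impose smoothness and human-dynamics constraints on per-frame GCN estimates.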