Dual Networks Based 3D Multi-Person Pose Estimation From Monocular Video

Cheng, Yu; Wang, Bo; Tan, Robby T.

doi:10.1109/tpami.2022.3170353

Cited by 15 publications (8 citation statements)

References 93 publications (143 reference statements)

“…[50][51][52] Weakly Supervised Learning: These methods do not use exact 3D pose annotations; rather, they utilize less precise data like 2D joint locations or multi-view images. The model could be trained using these 2D joint annotations when direct 3D pose labels are not available.…”

Section: Paradigm Description References (mentioning)

confidence: 99%

A Systematic Review of Recent Deep Learning Approaches for 3D Human Pose Estimation

El Kaid, Baïna. 2023. J. Imaging

Three-dimensional human pose estimation has made significant advancements through the integration of deep learning techniques. This survey provides a comprehensive review of recent 3D human pose estimation methods, with a focus on monocular images, videos, and multi-view cameras. Our approach stands out through a systematic literature review methodology, ensuring an up-to-date and meticulous overview. Unlike many existing surveys that categorize approaches based on learning paradigms, our survey offers a fresh perspective, delving deeper into the subject. For image-based approaches, we not only follow existing categorizations but also introduce and compare significant 2D models. Additionally, we provide a comparative analysis of these methods, enhancing the understanding of image-based pose estimation techniques. In the realm of video-based approaches, we categorize them based on the types of models used to capture inter-frame information. Furthermore, in the context of multi-person pose estimation, our survey uniquely differentiates between approaches focusing on relative poses and those addressing absolute poses. Our survey aims to serve as a pivotal resource for researchers, highlighting state-of-the-art deep learning strategies and identifying promising directions for future exploration in 3D human pose estimation.

“…In another hybrid approach introduced in a subsequent study [52], a fusion network is employed to blend top-down and bottom-up networks, enhancing the robustness of pose estimation from monocular videos. This fusion network unifies the 3D pose estimates to generate the final 3D poses.…”

Section: Fusion Approaches (mentioning)

confidence: 99%

A Systematic Review of Recent Deep Learning Approaches for 3D Human Pose Estimation

El Kaid, Baïna. 2023. J. Imaging

“…Therefore, there is a need to enhance the keypoint quality through the adoption of advanced pose estimation methods. These newer methods, such as [169], [170] for 2D keypoints, and [171], [172] for 3D keypoints, offer significant advancements in terms of keypoint quality. A list of frequently updated 2D and 3D pose estimation methods can be found online at [173] and [174], respectively.…”

Section: Enhancing Pose Estimation Quality (mentioning)

confidence: 99%

Advances in Skeleton-Based Fall Detection in RGB Videos: From Handcrafted to Deep Learning Approaches

Hoang, Lee, Piran et al. 2023. IEEE Access

In the elderly population, falls are one of the leading causes of fatal and non-fatal injuries. Fall detection and early alarms play an important role in mitigating the negative effects of falls, especially given the growing proportion of the elderly population. Due to their non-intrusive nature, data availability, and low deployment costs, RGB videos have been used in many previous studies to detect falls. The RGB data, however, can be affected by background environment changes, resulting in non-recognition. To overcome these challenges, many researchers propose extracting skeleton data from RGB videos and using it for fall detection. Although there have been multiple surveys on fall detection, most of them focus on assessing fall detection systems using different kinds of sensors, and a comprehensive evaluation of skeleton-based fall detection in RGB videos is lacking. In this paper, we examine the most recent advances in skeleton-based fall detection in RGB videos, from handcrafted feature-based methods to advanced deep learning algorithms. Further, we present several skeleton-based fall detection techniques and their performance results on various benchmark datasets, along with challenges and future directions in this field.

“…Monocular Multi-Person Reconstruction: In contrast to the notable advancements in reconstructing the clothed human for an individual, limited emphasis has been placed on multi-person scenarios, which are evidently more applicable to our daily experiences. Most existing monocular works can only estimate the coarse body shapes of multiple people from monocular observations [4,7,14,19,21,24,25,38,39,46]. Mustafa et al. [31] extend prior implicit methods to multiple people and recover spatially coherent 3D human shapes from an RGB image but mainly deal with cases where people are well-spaced and do not interact naturally in close range.…”

Section: Related Work (mentioning)

confidence: 99%