2023
DOI: 10.1109/tpami.2022.3170353

Dual Networks Based 3D Multi-Person Pose Estimation From Monocular Video

Abstract: Monocular 3D human pose estimation has made progress in recent years. Most of the methods focus on single persons, which estimate the poses in the person-centric coordinates, i.e., the coordinates based on the center of the target person. Hence, these methods are inapplicable for multi-person 3D pose estimation, where the absolute coordinates (e.g., the camera coordinates) are required. Moreover, multi-person pose estimation is more challenging than single pose estimation, due to inter-person occlusion and clo…

Cited by 15 publications (8 citation statements)
References 93 publications (143 reference statements)
“…[50][51][52] Weakly Supervised Learning These methods do not use exact 3D pose annotations; rather, they utilize less precise data like 2D joint locations or multi-view images. The model could be trained using these 2D joint annotations when direct 3D pose labels are not available.…”
Section: Paradigm Description References (mentioning)
confidence: 99%
“…In another hybrid approach introduced in a subsequent study [52], a fusion network is employed to blend top-down and bottom-up networks, enhancing the robustness of pose estimation from monocular videos. This fusion network unifies the 3D pose estimates to generate the final 3D poses.…”
Section: Fusion Approaches (mentioning)
confidence: 99%
“…Therefore, there is a need to enhance the keypoint quality through the adoption of advanced pose estimation methods. These newer methods, such as [169], [170] for 2D keypoints, and [171], [172] for 3D keypoints, offer significant advancements in terms of keypoint quality. A list of frequently updated 2D and 3D pose estimation methods can be found online at [173] and [174], respectively.…”
Section: ) Enhancing Pose Estimation Quality (mentioning)
confidence: 99%
“…Monocular Multi-Person Reconstruction In contrast to the notable advancements in reconstructing the clothed human for an individual, limited emphasis has been placed on multi-person scenarios, which are evidently more applicable to our daily experiences. Most existing monocular works can only estimate the coarse body shapes of multiple people from monocular observations [4,7,14,19,21,24,25,38,39,46]. Mustafa et al [31] extend prior implicit methods to multiple people and recover spatially coherent 3D human shapes from an RGB image but mainly deal with cases where people are well-spaced and do not interact naturally in close range.…”
Section: Related Work (mentioning)
confidence: 99%