2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022
DOI: 10.1109/cvpr52688.2022.01274

Single-Stage is Enough: Multi-Person Absolute 3D Pose Estimation

Cited by 23 publications (9 citation statements)
References 27 publications
“…The 3D temporal assignment problem was also tackled by the Hungarian matching approach for video-based multi-person 3D HPE, which achieved impressive results in [23], [24]. L. Jin et al. introduced a single-stage method that integrates human detection and pose estimation, simplifying the process and enhancing efficiency by directly estimating 3D poses from detected individuals in a single network pass, demonstrating significant improvements over traditional multi-stage methods [35].…”
Section: Multi-person 3D HPE (mentioning)
confidence: 99%
“…More recent studies have further refined such methods. For example, [7] proposed a deep regression network to estimate absolute 3D coordinates directly from images without intermediate 2D pose representations, and [8] improved the accuracy of pose estimation using a keypoint coordinate representation based on camera line-of-sight. Unlike these approaches, which estimate 3D poses from a single image, methods using multiple images of video frames have also been studied; for example, the authors of [9] used temporal convolutional networks (TCNs) to obtain absolute 3D human poses.…”
Section: Related Work (mentioning)
confidence: 99%
“…As 2D pose estimation is maturing, most top-down methods focus on how to regress the depth coordinates of human joints, while bottom-up approaches [6,10,40,41] encode the human pose into a representation that does not depend on the number of people. Reference [40] encodes the 3D pose into a 3D volume and refines it using fine-tuned structures.…”
Section: Multi-person 3D Pose Estimation (mentioning)
confidence: 99%
“…By using depth-aware part association, [41] can reconstruct 3D poses with the constraint of adaptive bone length and ordinal prior. Reference [10] regresses the 2D pose based on the centre point, root-point depth, and root-relative depth. Then it outputs the 3D pose through a single-stage CNN.…”
Section: Multi-person 3D Pose Estimation (mentioning)
confidence: 99%
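
The statement above describes combining a 2D pose with a root-point depth and root-relative depths to obtain a 3D pose. As a rough illustration only, the sketch below lifts 2D pixel joints to camera-space 3D coordinates with a standard pinhole camera model. It is not the cited paper's implementation: the function name `back_project`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the input layout are all assumptions made for this example, and the network that would predict the inputs is omitted.

```python
from typing import List, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def back_project(
    joints_2d: List[Point2D],
    root_depth: float,
    rel_depths: List[float],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Lift 2D joints to camera-space 3D points (hypothetical helper).

    joints_2d  : pixel coordinates (u, v) of each joint
    root_depth : absolute depth of the root joint, in metres (assumed unit)
    rel_depths : depth of each joint relative to the root joint
    fx, fy     : focal lengths in pixels (assumed known)
    cx, cy     : principal point in pixels (assumed known)
    """
    if len(joints_2d) != len(rel_depths):
        raise ValueError("joints_2d and rel_depths must have the same length")

    joints_3d: List[Point3D] = []
    for (u, v), dz in zip(joints_2d, rel_depths):
        # Absolute depth = root depth + root-relative depth.
        z = root_depth + dz
        # Invert the pinhole projection u = fx * x / z + cx (same for v).
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        joints_3d.append((x, y, z))
    return joints_3d
```

For instance, with `fx = fy = 1000`, `cx = 320`, `cy = 240`, a root joint at pixel `(320, 240)` with depth 2.0 maps to the optical axis at `(0, 0, 2.0)`, and a joint 100 pixels to its right that lies 0.5 further from the camera maps to `x = 0.25`.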