2022
DOI: 10.1145/3524497

Recent Advances of Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective

Abstract: Estimation of the human pose from a monocular camera has been an emerging research topic in the computer vision community with many applications. Recently, benefiting from deep learning technologies, a significant amount of research effort has advanced monocular human pose estimation in both 2D and 3D. Although some works have summarized different approaches, it remains challenging for researchers to gain an in-depth view of how these approaches work from 2D to 3D. In this pa…

Cited by 82 publications (26 citation statements)
References 192 publications
“…After the pose estimation and bounding box prediction, we obtain the necessary prior information to be used in monocular tracking and inter-camera ReID. There are many popular methods for monocular pedestrian tracking that design ingenious structures of neural networks to deal with occlusions or some other challenges in the video, pursuing high monocular tracking performance [3, 5].…”
Section: Methods Framework
Citation type: mentioning
confidence: 99%
“…To estimate the locations and movements of staff, we need to track everyone in the room and detect their poses [3, 4, 5, 6]. Hassaballah et al. [7] proposed a robust vehicle detection and tracking approach using a multi-scale deep convolution neural network.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Layout GAN. In order to represent the human pose, we first use the human keypoint detection method [3, 24, 28, 29] to predict the 2-D keypoint $K \in \mathbb{R}^{2 \times 25}$ of the human body. Then according to the predefined connection strategy, we can get the pose connection map $P^{t}_{d} \in \mathbb{R}^{3 \times H \times W}$ of the driving person, where $H \times W$ is the resolution of the image.…”
Section: Region Generation Network
Citation type: mentioning
confidence: 99%
“…Detecting human poses, for example, is critical for intelligent robots to adjust their actions and allocate their attention properly when interacting with humans [1]. Moreover, HPE has been widely exploited for many computer vision tasks, such as action recognition [2], [3], virtual reality [4], and human-computer interaction [5]-[7]. According to the data source type, HPE can be roughly divided into two scopes, i.e., image-based HPE and video-based HPE.…”
Section: Introduction
Citation type: mentioning
confidence: 99%