2022
DOI: 10.48550/arxiv.2207.10971
|View full text |Cite
Preprint
|
Sign up to set email alerts
|

Learning Human Kinematics by Modeling Temporal Correlations between Joints for Video-based Human Pose Estimation

Abstract: Estimating human poses from videos is critical in human-computer interaction. By precisely estimating human poses, the robot can provide an appropriate response to the human. Most existing approaches use the optical flow, RNNs, or CNNs to extract temporal features from videos. Despite the positive results of these attempts, most of them only straightforwardly integrate features along the temporal dimension, ignoring temporal correlations between joints. In contrast to previous methods, we propose a plug-and-pl… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1

Citation Types

0
1
0

Year Published

2024
2024
2024
2024

Publication Types

Select...
2

Relationship

0
2

Authors

Journals

citations
Cited by 2 publications
(1 citation statement)
references
References 38 publications
(83 reference statements)
0
1
0
Order By: Relevance
“…approach consists of two stages. First, utilize off-the-shelf 2D pose estimators (Sun et al 2019;Newell, Yang, and Deng 2016;Dang et al 2022) to estimate the 2D pose from the image and then regress the 3D pose from the obtained 2D human pose. Compared to direct estimation, this cascaded approach has the following advantages: 2D estimator is trained on more diverse and extensive 2D human pose datasets, which enables stronger visual perception and generalization ability (Martinez et al 2017).…”
Section: Introductionmentioning
confidence: 99%
“…approach consists of two stages. First, utilize off-the-shelf 2D pose estimators (Sun et al 2019;Newell, Yang, and Deng 2016;Dang et al 2022) to estimate the 2D pose from the image and then regress the 3D pose from the obtained 2D human pose. Compared to direct estimation, this cascaded approach has the following advantages: 2D estimator is trained on more diverse and extensive 2D human pose datasets, which enables stronger visual perception and generalization ability (Martinez et al 2017).…”
Section: Introductionmentioning
confidence: 99%