2019
DOI: 10.48550/arxiv.1912.05656
Preprint

VIBE: Video Inference for Human Body Pose and Shape Estimation

Abstract: Figure 1: Given challenging in-the-wild videos, a recent state-of-the-art video-pose-estimation approach [30] (top) fails to produce accurate and kinematically plausible 3D body shapes and poses. To address this, we exploit a large-scale motion-capture dataset to train a motion discriminator model in a GAN style. Our VIBE model (bottom) is able to produce realistic and kinematically plausible body meshes, outperforming previous work.

Cited by 14 publications (18 citation statements)
References: 45 publications
“…We are particularly interested in integrating the renderer with systems that infer SMPL bodies from images (e.g. [22,28,27]) to enable an end-to-end system for body image generation trained from images in the wild.…”
Section: Discussion (mentioning)
confidence: 99%
“…The weight for this term is set to be 50× the weight of the prior, as we expect a lower variance for pose dynamics. We compare our method with the recent work of [18], showing the results in table 3. As in the static setting, these are also the lowest errors reported so far.…”
Section: Methods (mentioning)
confidence: 99%
“…A step is considered valid only if the heel of one foot touches the toe of another foot. Then subject's 3D body key-points were extracted using VIBE system [9]. VIBE (Video Inference for Body Pose and Shape Estimation) is a video pose and shape estimation method that predicts the parameters of SMPL body model for each frame of an input video.…”
Section: Dataset Description (mentioning)
confidence: 99%
“…In this work (Figure 2), the focus of the automated assessment system is on the "Tandem gait" task which is part of a larger system called ATEC-Activated Test of Embodied Cognition [1,3,17]. A dataset has been created with 27 children performing the gait In order to automatically evaluate subject's performance, first VIBE [9] human pose estimation system was used to extract 3D body key-points. Then a deep learning based model was trained to classify subject's steps as valid or invalid.…”
Section: not specified (mentioning)
confidence: 99%