2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2017.586

3D Menagerie: Modeling the 3D Shape and Pose of Animals

Abstract: There has been significant work on learning realistic, articulated, 3D models of the human body. In contrast, there are few such models of animals, despite many applications. The main challenge is that animals are much less cooperative than humans. The best human bod…

Figure 1: Animals from images. We learn an articulated, 3D, statistical shape model of animals using very little training data. We fit the shape and pose of the model to 2D image cues, showing how it generalizes to previously unseen shapes.

Cited by 303 publications (244 citation statements) · References 26 publications
“…However, state-of-the-art performance currently requires body-scanning of many subjects to make body models. Typically, large datasets are collected to enable the creation of robust algorithms for inference on diverse humans (or for animals, scanning toy models has been fruitful (44)). Recently, outstanding improvements have been made to capture shapes of animals from images (22,45,46).…”
Section: Dense-representations Of Bodiesmentioning
confidence: 99%
“…This is a difficult challenge, as zebras are designed to blend into the background on the safari. This paper makes significant improvements in accuracy and realism, and builds on a line of elegant work from these authors (44,46).

[6**] DeeperCut: A deeper, stronger, and faster multi-person pose estimation model (29). DeeperCut is a highly accurate algorithm for multi-human pose estimation, owing to improved deep-learning-based body part detectors and image-conditioned pairwise terms that predict the location of body parts from the locations of other body parts. These terms are then used to find accurate poses of individuals via graph cutting.…”
Section: Outlook and Conclusionmentioning
confidence: 99%
“…For comparison, we also perform per-instance optimization over the model variables (D). The conditions compared are: (A) [34], which requires ground-truth keypoints and segmentations; (B) the network's feed-forward prediction run on a synthetic dataset; (C) our proposed method; (D) per-instance optimization over model variables rather than network features; (F) feed-forward prediction (no optimization); (G) feed-forward prediction without texture; (H) feed-forward prediction with noise on the bounding boxes.…”
Section: Methodsmentioning
confidence: 99%
“…The SMAL model is a function M(β, θ, γ) of shape β, pose θ, and translation γ. β is a vector of coefficients of the learned PCA shape space; θ ∈ R^(3N), θ = {r_i} for i = 1, …, N, is the set of relative rotations of the joints in the kinematic tree, expressed as Rodrigues vectors; and γ is the global translation applied to the root joint. Unlike [34], which uses N = 33 joints, we segment and add articulation to the ears, obtaining a model with N = 35 body parts. The SMAL function returns a 3D mesh, where the model template is shaped by β, articulated by θ, and shifted by γ.…”
Section: Smal Modelmentioning
confidence: 99%
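The quoted description of M(β, θ, γ) — a PCA-shaped template, articulated by per-joint Rodrigues rotations through a kinematic tree, then globally translated — can be sketched as below. This is a minimal illustrative sketch with linear blend skinning, not the authors' released SMAL code; the function name `smal_like_model` and the array layouts are assumptions.

```python
import numpy as np

def rodrigues(r):
    """Axis-angle (Rodrigues) vector -> 3x3 rotation matrix."""
    angle = np.linalg.norm(r)
    if angle < 1e-8:
        return np.eye(3)
    k = r / angle
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def smal_like_model(beta, theta, gamma, v_template, shape_dirs,
                    joints, parents, skin_weights):
    """M(beta, theta, gamma): shape the template by PCA coefficients,
    articulate the kinematic tree by per-joint Rodrigues vectors,
    then apply the global translation to every vertex."""
    # 1. Shape: template plus a linear combination of PCA directions
    #    (shape_dirs has shape (K, V, 3), beta has shape (K,)).
    v_shaped = v_template + np.tensordot(beta, shape_dirs, axes=([0], [0]))
    # 2. Pose: compose each joint's local rotation down the kinematic tree.
    n_joints = len(parents)
    G = [None] * n_joints
    for i in range(n_joints):
        T = np.eye(4)
        T[:3, :3] = rodrigues(theta[i])
        T[:3, 3] = joints[i] - (joints[parents[i]] if parents[i] >= 0 else 0.0)
        G[i] = T if parents[i] < 0 else G[parents[i]] @ T
    # Subtract the rest-pose joint locations so the transforms act on vertices.
    A = np.stack(G)
    for i in range(n_joints):
        A[i, :3, 3] -= A[i, :3, :3] @ joints[i]
    # 3. Linear blend skinning with per-vertex weights of shape (V, n_joints).
    v_hom = np.hstack([v_shaped, np.ones((len(v_shaped), 1))])
    v_posed = np.einsum('vj,jab,vb->va', skin_weights, A, v_hom)[:, :3]
    return v_posed + gamma
```

With β = 0 and all rotations zero, the function reduces to the template shifted by γ, which is a convenient sanity check for the kinematic-tree bookkeeping.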
“…For our experiments, we consider four datasets of meshes: shapes computed from the MNIST dataset [37], the MPI Dyna dataset of human shapes [49], a dataset of animal shapes from the Skinned Multi-Animal Linear model (SMAL) [65], and a dataset of human shapes from the Skinned Multi-Person Linear model (SMPL) [40] via the SURREAL dataset [60]. For each, we generate point clouds of size N T via area-weighted sampling.…”
Section: Methodsmentioning
confidence: 99%
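The area-weighted sampling mentioned in the quote above — drawing point clouds of a fixed size uniformly over a mesh surface — is a standard technique: pick triangles with probability proportional to their area, then sample uniform barycentric coordinates within each picked triangle. A minimal sketch (the function name `sample_points` is illustrative, not from the cited paper):

```python
import numpy as np

def sample_points(vertices, faces, n_points, seed=None):
    """Draw n_points uniformly over a triangle mesh's surface via
    area-weighted face selection plus uniform barycentric sampling."""
    rng = np.random.default_rng(seed)
    v0, v1, v2 = (vertices[faces[:, i]] for i in range(3))
    # Triangle areas from the cross product of two edge vectors.
    areas = 0.5 * np.linalg.norm(np.cross(v1 - v0, v2 - v0), axis=1)
    idx = rng.choice(len(faces), size=n_points, p=areas / areas.sum())
    # Uniform barycentric coordinates via the square-root trick.
    u, v = rng.random(n_points), rng.random(n_points)
    su = np.sqrt(u)
    b0, b1, b2 = 1.0 - su, su * (1.0 - v), su * v
    return (b0[:, None] * v0[idx]
            + b1[:, None] * v1[idx]
            + b2[:, None] * v2[idx])
```

Without the area weighting, small triangles would be oversampled relative to large ones, biasing the point cloud toward finely tessellated regions of the mesh.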