2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019
DOI: 10.1109/iccv.2019.00546

Three-D Safari: Learning to Estimate Zebra Pose, Shape, and Texture From Images “In the Wild”

Figure 1: Zebras from images. We automatically extract 3D textured models of zebras from in-the-wild images. We regress directly from pixels, without keypoint detection or segmentation.

Abstract: We present the first method to perform automatic 3D pose, shape and texture capture of animals from images acquired in-the-wild. In particular, we focus on the problem of capturing 3D information about Grevy's zebras from a collection of images. The Grevy's zebra is one of the most endangered species in Africa, with onl…

Cited by 136 publications (114 citation statements)
References: 29 publications
“…Typically, large datasets are collected to enable the creation of robust algorithms for inference on diverse humans (or for animals, scanning toy models has been fruitful (44)). Recently, outstanding improvements have been made to capture shapes of animals from images (22,45,46). However, there are no animal-specific toolboxes geared towards neuroscience applications, although we believe that this will change in the near future, as for many applications having the soft-tissue measured will be highly important, i.e.…”
Section: Dense-representations Of Bodies (mentioning)
confidence: 99%
“…Future advances will likely allow for the calibration and synchronization of imaging devices across multiple UAVs (e.g., Price et al, 2018; Saini et al, 2019). This would make it possible to measure the full 3-D posture of wild animals (e.g., Zuffi et al, 2019) in scenarios where fixed camera systems (e.g., Nath et al, 2019) would not be tractable, such as during migratory or predation events. When combined, these technologies could allow researchers to address questions about the behavioral ecology of animals that were previously impossible to answer.…”
Section: Discussion (mentioning)
confidence: 99%
“…New pose estimation methods are already replacing human annotations with fully articulated volumetric 3-D models of the animal’s body (e.g., the SMAL model from Zuffi et al, 2017 or the SMALST model from Zuffi et al, 2019), and the 3-D scene can be estimated using unsupervised, semi-supervised, or weakly-supervised methods (e.g., Jaques et al, 2019; Zuffi et al, 2019), where the shape, position, and posture of the animal’s body, the camera position and lens parameters, and the background environment and lighting conditions are jointly learned directly from 2-D images by a deep-learning model (Valentin et al, 2019; Zuffi et al, 2019). These inverse graphics models (Kulkarni et al, 2015; Sabour et al, 2017; Valentin et al, 2019) take advantage of recently developed differentiable graphics engines that allow 3-D rendering parameters to be controlled using standard optimization methods (Zuffi et al, 2019; Valentin et al, 2019). After optimization, the volumetric 3-D timeseries data predicted by the deep learning model could be used directly for behavioral analysis or specific keypoints or body parts could be selected for analysis post-hoc.…”
Section: Discussion (mentioning)
confidence: 99%
“…The performance evaluation of the network trained with the present dataset revealed that there is still room for improvement regarding the misattribution of the limb keypoints (Figure 2h, Table 2), although the RMSE indicates the human-level performance (Figure 3). The DeepLabCut algorithm (Mathis et al, 2018) used in the present evaluation does not explicitly utilize the prior knowledge about the animal's body, whereas the other algorithms were suggested to use the connection between keypoints (Insafutdinov et al, 2016; Cao et al, 2017) or 3D shape of the subject (Biggs et al, 2018; Zuffi et al, 2019). Such utilization of the prior knowledge may help to improve the estimation.…”
Section: Discussion (mentioning)
confidence: 99%