Panoptic Studio: A Massively Multiview System for Social Interaction Capture

Joo, Hanbyul; Simon, Tomas; Li, Xulong; Liu, Hao; Tan, Lei; Gui, Lin; Banerjee, Sean; Godisart, Timothy; Nabbe, Bart; Matthews, Iain; Kanade, Takeo; Nobuhara, Shohei; Sheikh, Yaser

doi:10.1109/tpami.2017.2782743

Cited by 259 publications

(252 citation statements)

References 50 publications

Supporting

Mentioning

246

Contrasting

Unclassified

“…We generate the ground truth depth maps from the point cloud with the screened Poisson surface reconstruction method [15]. We choose scenes: 1,4,9,10,11,12,13,15,23,24,29,32,33,34,48,49,62,75,77,110,114,118 as the testing set and the other scenes as training set. The RGBD, SUN3D, MVS and Scenes11 datasets contain more than 30000 different scenes in total, which are very different from the DTU dataset.…”

Section: Implementation Details

mentioning

confidence: 99%

MVS2: Deep Unsupervised Multi-View Stereo with Multi-View Symmetry

Dai, Zhang, Rao, et al. 2019

2019 International Conference on 3D Vision (3DV)

The success of existing deep-learning based multi-view stereo (MVS) approaches greatly depends on the availability of large-scale supervision in the form of dense depth maps. Such supervision, while not always possible, tends to hinder the generalization ability of the learned models in never-seen-before scenarios. In this paper, we propose the first unsupervised learning based MVS network, which learns the multi-view depth maps from the input multi-view images and does not need ground-truth 3D training data. Our network is symmetric in predicting depth maps for all views simultaneously, where we enforce cross-view consistency of multi-view depth maps during both training and testing stages. Thus, the learned multi-view depth maps naturally comply with the underlying 3D scene geometry. Besides, our network also learns the multi-view occlusion maps, which further improves the robustness of our network in handling real-world occlusions. Experimental results on multiple benchmarking datasets demonstrate the effectiveness of our network and the excellent generalization ability.

“…We project the 3D human poses of different HOIs into 2D poses with random camera poses. (ii) The dataset proposed and collected by [19], which also contains 3D poses of multiple persons in social interactions. We project 3D poses into 2D following the same method as in (i).…”

Section: Concurrent Action Detection

mentioning

confidence: 99%

Holistic++ Scene Understanding: Single-View 3D Holistic Scene Parsing and Human Pose Estimation With Human-Object Interaction and Physical Commonsense

Chen, Huang, Tao, et al. 2019

2019 IEEE/CVF International Conference on Computer Vision (ICCV)

We propose a new 3D holistic++ scene understanding problem, which jointly tackles two tasks from a single-view image: (i) holistic scene parsing and reconstruction, i.e., 3D estimation of object bounding boxes, camera pose, and room layout, and (ii) 3D human pose estimation. The intuition is to leverage the coupled nature of these two tasks to improve the granularity and performance of scene understanding. We propose to exploit two critical and essential connections between these two tasks: (i) human-object interaction (HOI) to model the fine-grained relations between agents and objects in the scene, and (ii) physical commonsense to model the physical plausibility of the reconstructed scene. The optimal configuration of the 3D scene, represented by a parse graph, is inferred using Markov chain Monte Carlo (MCMC), which efficiently traverses through the non-differentiable joint solution space. Experimental results demonstrate that the proposed algorithm significantly improves the performance of the two tasks on three datasets, showing an improved generalization ability.

“…The vector $\overleftrightarrow{L_{n,:} X^{*}}$ in Eq. (20) lies on a local motion plane formed by $X^{*}_{n,:}$ and its two neighboring points. Similarly, each row in $L X^{*}$ will also be a vector on a local motion plane.…”

Section: Structure Reconstruction Accuracy

mentioning

confidence: 99%

“…Average running time for minimizing either X or W are smaller due to the sparsity of W. Total number of iterations depends on initialization quality, reported experiments ran an average of 62.26 iterations. Results on Dancing and Toddler [20]. Disjoint Dancing segments form an input datum.…”

mentioning

confidence: 99%

Discrete Laplace Operator Estimation for Dynamic 3D Reconstruction

Dunn 2019

2019 IEEE/CVF International Conference on Computer Vision (ICCV)

We present a general paradigm for dynamic 3D reconstruction from multiple independent and uncontrolled image sources having arbitrary temporal sampling density and distribution. Our graph-theoretic formulation models the spatio-temporal relationships among our observations in terms of the joint estimation of their 3D geometry and its discrete Laplace operator. Towards this end, we define a tri-convex optimization framework that leverages the geometric properties and dependencies found among a Euclidean shape-space and the discrete Laplace operator describing its local and global topology. We present a reconstructability analysis, experiments on motion capture data and multi-view image datasets, as well as explore applications to geometry-based event segmentation and data association.
