Accelerating 3D Deep Learning with PyTorch3D

Ravi, Nikhila; Reizenstein, Jeremy; Novotný, David; Gordon, Taylor; Lo, Wan-Yen; Johnson, Justin C.; Gkioxari, Georgia

doi:10.48550/arxiv.2007.08501

Cited by 181 publications (281 citation statements)

References 44 publications (90 reference statements)

Supporting

Mentioning: 268

Contrasting

“…Note that we also normalize the mesh to the unit size before rendering. We use PyTorch3D [36] as our renderer. The textured mesh densely fills in pixels and removes hidden surfaces.…”

Section: Final Rendering and Postprocessing (mentioning)

confidence: 99%

GeoFill: Reference-Based Image Inpainting of Scenes with Complex Geometry

Zhao, Barnes, Zhou et al. 2022

Preprint

Reference-guided image inpainting restores image pixels by leveraging the content from another reference image. The previous state-of-the-art, TransFill, warps the source image with multiple homographies, and fuses them together for hole filling. Inspired by structure from motion pipelines and recent progress in monocular depth estimation, we propose a more principled approach that does not require heuristic planar assumptions. We leverage a monocular depth estimate and predict relative pose between cameras, then align the reference image to the target by a differentiable 3D reprojection and a joint optimization of relative pose and depth map scale and offset. Our approach achieves state-of-the-art performance on both the RealEstate10K and MannequinChallenge datasets with large baselines, complex geometry and extreme camera motions. We experimentally verify our approach is also better at handling large holes.

“…Because the rendering operation is normally discrete, it does not provide usable error gradients for optimization. A variety of mesh [10,33,29,24,31,8,43], point-cloud, and implicit [32,40,39,48] based differentiable renderers have been proposed. [24] developed an approximation of gradient for rasterization.…”

Section: Related Work (mentioning)

confidence: 99%

SurfGen: Adversarial 3D Shape Synthesis with Explicit Surface Discriminators

Luo, Li, Zhang et al. 2022

Preprint

Recent advances in deep generative models have led to immense progress in 3D shape synthesis. While existing models are able to synthesize shapes represented as voxels, point-clouds, or implicit functions, these methods only indirectly enforce the plausibility of the final 3D shape surface. Here we present a 3D shape synthesis framework (SurfGen) that directly applies adversarial training to the object surface. Our approach uses a differentiable spherical projection layer to capture and represent the explicit zero isosurface of an implicit 3D generator as functions defined on the unit sphere. By processing the spherical representation of 3D object surfaces with a spherical CNN in an adversarial setting, our generator can better learn the statistics of natural shape surfaces. We evaluate our model on large-scale shape datasets, and demonstrate that the end-to-end trained model is capable of generating high fidelity 3D shapes with diverse topology. Code is available at https://github.com/aluo-x/NeuralRaycaster.

“…Modern techniques learn disparity from image pairs [30], estimate correspondences with contrastive learning [55], perform multi-view stereopsis via differentiable ray projection [28] or learn to reconstruct scenes while optimizing for cameras [26,48]. Differentiable rendering [10,29,36,38,41,47,50] allows gradients to flow to 3D scenes via 2D re-projections. [10,29,38,50] reconstruct single objects from a single view via rendering from 2 or more views during training.…”

Section: Related Work (mentioning)

confidence: 99%

“…In addition to numerous semantic-specific details, recognition in novel viewpoints via direct appearance synthesis is suboptimal: one may be sure of the presence of a rug behind a couch, but unsure of its particular color. Similarly, there have been advances in learning to infer 3D properties of scenes from image cues [20,46,63], or with differentiable rendering [10,29,38,50] and other methods for bypassing the need for direct 3D supervision [27,33,34,68]. However, these approaches do not connect to complex scene semantics; they primarily focus on single objects or small, less diverse 3D annotated datasets.…”

Section: Introduction (mentioning)

confidence: 99%

Recognizing Scenes from Novel Viewpoints

Qian, Kirillov, Ravi et al. 2021

Preprint

Self Cite

Humans can perceive scenes in 3D from a handful of 2D views. For AI agents, the ability to recognize a scene from any viewpoint given only a few images enables them to efficiently interact with the scene and its objects. In this work, we attempt to endow machines with this ability. We propose a model which takes as input a few RGB images of a new scene and recognizes the scene from novel viewpoints by segmenting it into semantic categories. All this without access to the RGB images from those views. We pair 2D scene recognition with an implicit 3D representation and learn from multi-view 2D annotations of hundreds of scenes without any 3D supervision beyond camera poses. We experiment on challenging datasets and demonstrate our model's ability to jointly capture semantics and geometry of novel scenes with diverse layouts, object types and shapes.
