Equivariant Neural Rendering

Dupont, Emilien; Bautista, Miguel Ángel; Colburn, Alex; Sankar, Aditya; Guestrin, Carlos; Susskind, Josh; Shan, Qi

doi:10.48550/arxiv.2006.07630

Cited by 6 publications

(23 citation statements)

References 26 publications

“…Learning to synthesize novel views of an object or a scene given one or more sparse observations has been widely studied in the literature [5,6,7,8,19,9,20,10,13,21,11,14]. A unifying problem definition for this set of approaches is to predict a target view given one or more source views, conditioned on a relative camera transformation.…”

Section: Related Work (mentioning)

confidence: 99%

“…A modification of such approaches was concurrently proposed in [13] and [21], where instead of predicting a 2D flow-field, these methods predict a latent 3D representation (in the form of a feature volume) to which the relative camera transformation from source to target view can be readily applied.…”

Section: Related Work (mentioning)

confidence: 99%
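
To make the statement above concrete, here is a minimal sketch of applying a camera transformation directly to a voxel grid. It assumes a single-channel volume stored as nested lists and only handles a 90-degree turn about the vertical axis, which needs no interpolation; the methods cited as [13] and [21] operate on learned multi-channel feature volumes and general transformations. The names `rotate_volume_90` and `demo_volume` are hypothetical and do not come from either paper.

```python
from typing import List

Volume = List[List[List[float]]]  # indexed as volume[depth][height][width]


def rotate_volume_90(volume: Volume) -> Volume:
    """Rotate a voxel grid by 90 degrees about its height (vertical) axis.

    A D x H x W grid becomes a W x H x D grid: every horizontal slice is
    turned by a quarter turn, so values are moved but never interpolated.
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    return [
        [[volume[j][h][width - 1 - i] for j in range(depth)] for h in range(height)]
        for i in range(width)
    ]


def demo_volume() -> Volume:
    """Build a small 2 x 1 x 3 volume with distinct values (illustrative data)."""
    return [[[float(d * 3 + w) for w in range(3)]] for d in range(2)]
```

Applying `rotate_volume_90` four times returns the original grid, which is the kind of consistency a transformable latent volume is meant to provide.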

“…Once a feature map $F_s$ is obtained, we perform an inverse projection step to back-project $F_s$ into a latent volumetric tensor $Z_s \in \mathbb{R}^{c \times d_s \times h_s \times w_s}$, where $d_s$, $h_s$, $w_s$ are depth, height and width for the volumetric representation. Instead of reshaping 2D feature maps into a 3D volumetric representation like ENR [13], we found that using an inverse projection step is beneficial to preserve the 3D geometry and texture information (Cf. 4 for empirical evidence).…”

Section: Encoding (mentioning)

confidence: 99%
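
The reshaping alternative mentioned in the statement above can be sketched as a plain regrouping of channels. This is an assumption-laden illustration, not the ENR implementation: it treats a 2D feature map as a list of `c * depth` channel planes and splits them into `c` groups of `depth` planes, whereas the quoted work replaces this step with an inverse projection that uses camera geometry. The function name `reshape_to_volume` is hypothetical.

```python
from typing import List

Plane = List[List[float]]  # one H x W feature channel


def reshape_to_volume(feature_map: List[Plane], depth: int) -> List[List[Plane]]:
    """Regroup (c * depth) 2D channels into c groups of `depth` planes each.

    The result is indexed as volume[channel][depth][row][col]. Values are
    only regrouped, never recomputed, so no camera geometry is involved.
    """
    if depth <= 0 or len(feature_map) % depth != 0:
        raise ValueError("channel count must be a positive multiple of depth")
    return [feature_map[k:k + depth] for k in range(0, len(feature_map), depth)]
```

For example, six 1 x 1 channel planes with `depth=3` become a volume with two channels and three depth slices each.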

“…Being able to synthesize images at target camera viewpoints efficiently given sparse source views serves a fundamental purpose in building intelligent visual behaviour [2,3,4]. The problem of learning to synthesize novel views has been widely studied in literature, with approaches ranging from traditional small-baseline view synthesis relying on multi-plane imaging [5,6,7,8], flow estimation [9,10], to explicitly modeling 3D geometry via point-clouds [11], meshes [12], and voxels [13].…”

Section: Introduction (mentioning)

confidence: 99%

“…A recent wave of approaches for view synthesis has adopted continuous radiance field representations [1,14,15,16,17], where scenes are represented as a continuous function that shares its domain with the signal being fitted (e.g. a function that takes points in $\mathbb{R}^3$ as input, to model a 3D signal), as opposed to discrete representations where the 3D signals are encoded in a discrete geometric structure like a volume [13] or a mesh [12]. Although continuous radiance field representations enjoy the benefits of being resolution-free or modeling view-dependent effects, they are not efficient for real-world use cases that require real-time performance.…”

Section: Introduction (mentioning)

confidence: 99%

Fast and Explicit Neural View Synthesis

Guo, Bautista, Colburn, et al. 2021

Preprint

Self Cite

We study the problem of novel view synthesis of a scene comprised of 3D objects. We propose a simple yet effective approach that is neither continuous nor implicit, challenging recent trends on view synthesis. We demonstrate that although continuous radiance field representations have gained a lot of attention due to their expressive power, our simple approach obtains comparable or even better novel view reconstruction quality compared with state-of-the-art baselines while increasing rendering speed by over 400x. Our model is trained in a category-agnostic manner and does not require scene-specific optimization. Therefore, it is able to generalize novel view synthesis to object categories not seen during training. In addition, we show that with our simple formulation, we can use view synthesis as a self-supervision signal for efficient learning of 3D geometry without explicit 3D supervision.

Unconstrained Scene Generation with Locally Conditioned Radiance Fields

DeVries, Bautista, Srivastava, et al. 2021

2021 IEEE/CVF International Conference on Computer Vision (ICCV)

Self Cite

103

On the generalization of learning-based 3D reconstruction

Bautista, Talbott, Zhai, et al. 2020

Preprint

Self Cite

State-of-the-art learning-based monocular 3D reconstruction methods learn priors over object categories on the training set, and as a result struggle to achieve reasonable generalization to object categories unseen during training. In this paper we study the inductive biases encoded in the model architecture that impact the generalization of learning-based 3D reconstruction methods. We find that 3 inductive biases impact performance: the spatial extent of the encoder, the use of the underlying geometry of the scene to describe point features, and the mechanism to aggregate information from multiple views. Additionally, we propose mechanisms to enforce those inductive biases: a point representation that is aware of camera position, and a variance cost to aggregate information across views. Our model achieves state-of-the-art results on the standard ShapeNet 3D reconstruction benchmark in various settings.
