Deep networks have recently enjoyed enormous success when applied to recognition and classification problems in computer vision [20,29], but their use in graphics problems has been limited ([21,7] are notable recent exceptions). In this work, we present a novel deep architecture that performs new view synthesis directly from pixels, trained on a large number of posed image sets. In contrast to traditional approaches, which consist of multiple complex stages of processing, each of which requires careful tuning and can fail in unexpected ways, our system is trained end-to-end. The pixels from neighboring views of a scene are presented to the network, which then directly produces the pixels of the unseen view. The benefits of our approach include generality (we require only posed image sets and can easily apply our method to different domains) and high-quality results on traditionally difficult scenes. We believe this is due to the end-to-end nature of our system, which is able to plausibly generate pixels according to color, depth, and texture priors learned automatically from the training data. To verify our method, we show that it can convincingly reproduce known test views from nearby imagery. Additionally, we show images rendered from novel viewpoints. To our knowledge, our work is the first to apply deep learning to the problem of new view synthesis from sets of real-world, natural imagery.
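The end-to-end setup described above can be caricatured in a few lines: a learnable function maps the pixels of neighboring views to the pixels of a held-out view, and its parameters are fit by gradient descent on the pixel reconstruction error. Everything below is a hypothetical toy, not the paper's architecture: a single linear layer stands in for the deep network, and random vectors stand in for real posed imagery.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: each sample concatenates the flattened pixels of two
# neighboring views; the label is the pixels of the held-out view.
# A random linear map generates the data (stand-in for real scenes).
n_pix = 16
true_map = rng.normal(size=(n_pix, 2 * n_pix))
neighbors = rng.normal(size=(200, 2 * n_pix))
targets = neighbors @ true_map.T

# "Network": a single linear layer trained end-to-end by gradient
# descent on pixel reconstruction error (a deep model in the paper).
W = np.zeros((n_pix, 2 * n_pix))
lr = 0.3
for _ in range(500):
    pred = neighbors @ W.T
    grad = (pred - targets).T @ neighbors / len(neighbors)
    W -= lr * grad

test_err = np.mean(np.abs(neighbors @ W.T - targets))
```

The point of the caricature is only the training signal: no hand-tuned intermediate stages, just a differentiable map from input pixels to output pixels supervised by known views.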
We present a fast, approximate solution for image-based lighting of curve-based hair, capturing both diffuse and specular reflection with occlusion. Our technique draws on and extends a hair self-shadowing model originally developed for traditional point-source lighting.
Rhythm and Hues Studios

We present a markerless facial motion capture algorithm based on nonlinear optimization of a texture-based error metric that eliminates the need for precise camera calibration and provides flexible controls for tuning the optimization.

Implementation

The use of nonlinear optimization for estimating facial animation parameters is quite popular, as evidenced by [Paterson and Fitzgibbon 2003] and [Williams 2005]. However, these approaches include precise camera calibration and model tracking either as prerequisites or as part of the problem space, while ours requires only approximations.

We use a polygonal facial model driven by a control rig consisting of 10-50 scalar parameters that deform the model by blending between rest poses or driving a muscle simulation. Any scalar whose variation continuously affects the appearance of any part of the face is a suitable rig parameter.

Our solver adjusts the rig parameters in an attempt to minimize a discrepancy metric between successive frames. We calculate this metric by constructing textures that capture the camera projection of the reference footage onto the deformed model at each frame, taking visibility into account. The textures are efficiently computed using a specialized tool called Primitex, which is integrated into our solver.

Primitex

Primitex, an acronym for "PRoject IMage Into TEXture", is a tool for efficiently projecting a "plate" (a perspective camera's image of a model, in this case live footage) into a texture space defined within the model. This technique requires a polygonal model with suitable (bijective) texture coordinates and an approximately calibrated camera in addition to the plate. We rasterize selected parts of the model into their native texture space, keeping track of interpolated positions and normals. To color each output pixel, we sample the plate based on the interpolated position and given camera data. Backface culling and ray tracing detect hidden surfaces, whose corresponding texture pixels are marked invalid.

Working in texture space makes our method highly tolerant of camera calibration errors: as long as the camera sees most of the same geometry in adjacent frames, the textures that Primitex outputs will have many pixels in common, provided that the geometry was deformed to match the plates at both frames and the lighting conditions were consistent. Our texture-based error metric enables the user to easily mask out specific materials or texture regions from consideration (e.g., using a marquee to select a part of the model in texture space). This expedites the solution and improves accuracy by eliminating the residual noise resulting from the solver's attempt to adjust rig parameters that the user knows a priori to be irrelevant to a given area of interest on the model.

Nonlinear Optimization and Pyramid Solver

Our goal at each frame is to deform the facial geometry by setting the facial rig parameters in such a way as to generate a Primitex texture that differs minimally from the previous frame's texture. We use the Lev...
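The solver loop above can be sketched as follows. Everything here is a hypothetical toy, not the production system: render_texture stands in for a Primitex projection of the deformed model (reduced to a two-parameter linear blend), the mask plays the role of the user's texture-space region selection and visibility culling, and plain finite-difference gradient descent replaces the production nonlinear solver.

```python
import numpy as np

def render_texture(params, grid):
    # Toy stand-in for a Primitex output: the "rig" here is just a
    # linear blend of two basis deformations (purely illustrative).
    u, v = grid
    return params[0] * u + params[1] * v

def texture_discrepancy(params, target, grid, mask):
    # Mean squared difference over valid texels; the mask zeroes out
    # texels the user marked irrelevant (or that failed visibility).
    diff = (render_texture(params, grid) - target) * mask
    return float(np.sum(diff ** 2) / np.sum(mask))

def solve_rig(target, grid, mask, init, steps=300, lr=1.0, eps=1e-4):
    # Finite-difference gradient descent on the rig parameters; a
    # real solver would use a proper nonlinear least-squares method,
    # but the objective being minimized has the same shape.
    p = np.array(init, dtype=float)
    for _ in range(steps):
        base = texture_discrepancy(p, target, grid, mask)
        grad = np.zeros_like(p)
        for i in range(p.size):
            q = p.copy()
            q[i] += eps
            grad[i] = (texture_discrepancy(q, target, grid, mask) - base) / eps
        p -= lr * grad
    return p

# "Previous frame" texture generated by known rig parameters; the
# solver recovers them by minimizing the texture-space discrepancy.
grid = np.meshgrid(np.linspace(0.0, 1.0, 32), np.linspace(0.0, 1.0, 32))
mask = np.ones_like(grid[0])   # user could zero out regions here
target = render_texture([2.0, 3.0], grid)
est = solve_rig(target, grid, mask, init=[1.5, 2.5])
```

Note how the mask enters the objective directly: zeroing a region removes its residual entirely, which is why masking irrelevant materials both speeds up and stabilizes the solve.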