Deep networks have recently enjoyed enormous success when applied to recognition and classification problems in computer vision [20,29], but their use in graphics problems has been limited ([21,7] are notable recent exceptions). In this work, we present a novel deep architecture that performs new view synthesis directly from pixels, trained on a large number of posed image sets. In contrast to traditional approaches, which consist of multiple complex stages of processing, each of which requires careful tuning and can fail in unexpected ways, our system is trained end-to-end. The pixels from neighboring views of a scene are presented to the network, which then directly produces the pixels of the unseen view. The benefits of our approach include generality (we require only posed image sets and can easily apply our method to different domains) and high-quality results on traditionally difficult scenes. We believe this is due to the end-to-end nature of our system, which is able to plausibly generate pixels according to color, depth, and texture priors learned automatically from the training data. To verify our method, we show that it can convincingly reproduce known test views from nearby imagery. Additionally, we show images rendered from novel viewpoints. To our knowledge, our work is the first to apply deep learning to the problem of new view synthesis from sets of real-world, natural imagery.
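The end-to-end setup described above can be caricatured in a few lines: a learnable function maps the pixels of neighboring views to the pixels of a held-out view, and its parameters are fit by gradient descent on the pixel reconstruction error. Everything below is a hypothetical toy, not the paper's architecture: a single linear layer stands in for the deep network, and random vectors stand in for real posed imagery.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: each sample concatenates the flattened pixels of two
# neighboring views; the label is the pixels of the held-out view.
# A random linear map generates the data (stand-in for real scenes).
n_pix = 16
true_map = rng.normal(size=(n_pix, 2 * n_pix))
neighbors = rng.normal(size=(200, 2 * n_pix))
targets = neighbors @ true_map.T

# "Network": a single linear layer trained end-to-end by gradient
# descent on pixel reconstruction error (a deep model in the paper).
W = np.zeros((n_pix, 2 * n_pix))
lr = 0.3
for _ in range(500):
    pred = neighbors @ W.T
    grad = (pred - targets).T @ neighbors / len(neighbors)
    W -= lr * grad

test_err = np.mean(np.abs(neighbors @ W.T - targets))
```

The point of the caricature is only the training signal: no hand-tuned intermediate stages, just a differentiable map from input pixels to output pixels supervised by known views.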
We present a fast, approximate solution for image-based lighting of curve-based hair, capturing both diffuse and specular reflection with occlusion. Our technique draws on and extends a hair self-shadowing model originally developed for traditional point-source lighting.
Rhythm and Hues Studios

We present a markerless facial motion capture algorithm based on nonlinear optimization of a texture-based error metric that eliminates the need for precise camera calibration and provides flexible controls for tuning the optimization.

Implementation

The use of nonlinear optimization for estimating facial animation parameters is quite popular, as evidenced by [Paterson and Fitzgibbon 2003] and [Williams 2005]. However, these approaches include precise camera calibration and model tracking either as prerequisites or as part of the problem space, while ours requires only approximations.

We use a polygonal facial model driven by a control rig consisting of 10-50 scalar parameters that deform the model by blending between rest poses or driving a muscle simulation. Any scalar whose variation continuously affects the appearance of any part of the face is a suitable rig parameter.

Our solver adjusts the rig parameters in an attempt to minimize a discrepancy metric between successive frames. We calculate this metric by constructing textures that capture the camera projection of the reference footage onto the deformed model at each frame, taking visibility into account. The textures are efficiently computed using a specialized tool called Primitex, which is integrated into our solver.

Primitex

Primitex, an acronym for "PRoject IMage Into TEXture", is a tool for efficiently projecting a "plate" (a perspective camera's image of a model, in this case live footage) into a texture space defined within the model. This technique requires a polygonal model with suitable (bijective) texture coordinates and an approximately calibrated camera in addition to the plate. We rasterize selected parts of the model into their native texture space, keeping track of interpolated positions and normals. To color each output pixel, we sample the plate based on the interpolated position and given camera data. Backface culling and ray tracing detect hidden surfaces, whose corresponding texture pixels are marked invalid.

Working in texture space makes our method highly tolerant of camera calibration errors: as long as the camera sees most of the same geometry in adjacent frames, the textures that Primitex outputs will have many pixels in common, provided that the geometry was deformed to match the plates at both frames and the lighting conditions were consistent. Our texture-based error metric enables the user to easily mask out specific materials or texture regions from consideration (e.g., using a marquee to select a part of the model in texture space). This expedites the solution and improves accuracy by eliminating the residual noise resulting from the solver's attempt to adjust rig parameters that the user knows a priori to be irrelevant to a given area of interest on the model.

Nonlinear Optimization and Pyramid Solver

Our goal at each frame is to deform the facial geometry by setting the facial rig parameters in such a way as to generate a Primitex texture that differs minimally from the previous frame's texture. We use the Lev...
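The solver loop above can be sketched as follows. Everything here is a hypothetical toy, not the production system: render_texture stands in for a Primitex projection of the deformed model (reduced to a two-parameter linear blend), the mask plays the role of the user's texture-space region selection and visibility culling, and plain finite-difference gradient descent replaces the production nonlinear solver.

```python
import numpy as np

def render_texture(params, grid):
    # Toy stand-in for a Primitex output: the "rig" here is just a
    # linear blend of two basis deformations (purely illustrative).
    u, v = grid
    return params[0] * u + params[1] * v

def texture_discrepancy(params, target, grid, mask):
    # Mean squared difference over valid texels; the mask zeroes out
    # texels the user marked irrelevant (or that failed visibility).
    diff = (render_texture(params, grid) - target) * mask
    return float(np.sum(diff ** 2) / np.sum(mask))

def solve_rig(target, grid, mask, init, steps=300, lr=1.0, eps=1e-4):
    # Finite-difference gradient descent on the rig parameters; a
    # real solver would use a proper nonlinear least-squares method,
    # but the objective being minimized has the same shape.
    p = np.array(init, dtype=float)
    for _ in range(steps):
        base = texture_discrepancy(p, target, grid, mask)
        grad = np.zeros_like(p)
        for i in range(p.size):
            q = p.copy()
            q[i] += eps
            grad[i] = (texture_discrepancy(q, target, grid, mask) - base) / eps
        p -= lr * grad
    return p

# "Previous frame" texture generated by known rig parameters; the
# solver recovers them by minimizing the texture-space discrepancy.
grid = np.meshgrid(np.linspace(0.0, 1.0, 32), np.linspace(0.0, 1.0, 32))
mask = np.ones_like(grid[0])   # user could zero out regions here
target = render_texture([2.0, 3.0], grid)
est = solve_rig(target, grid, mask, init=[1.5, 2.5])
```

Note how the mask enters the objective directly: zeroing a region removes its residual entirely, which is why masking irrelevant materials both speeds up and stabilizes the solve.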