Fig. 1. Visualizations under natural lighting of four captured 1k-resolution SVBRDFs estimated using our deep inverse rendering framework. The leather material (left) is reconstructed from just 2 input photographs captured with a mobile phone camera and flash, while the other materials are recovered from 20 input photographs. In this paper we present a unified deep inverse rendering framework for estimating the spatially-varying appearance properties of a planar exemplar from an arbitrary number of input photographs, ranging from just a single photograph to many photographs. The precision of the estimated appearance scales from plausible, when the input photographs fail to capture all the reflectance information, to accurate for large input sets. A key distinguishing feature of our framework is that it directly optimizes for the appearance parameters in a latent embedded space of spatially-varying appearance, such that no handcrafted heuristics are needed to regularize the optimization. This latent embedding is learned through a fully convolutional auto-encoder that has been designed to regularize the optimization. Our framework not only supports an arbitrary number of input photographs, but also at high
Figure 1: Affinity-based edit propagation methods allow one to change the appearance of an image or video (e.g., the color of the bird here) using only a few strokes, yet they consume a prohibitive amount of time and memory for large data (e.g., 48 minutes and 23GB for this video containing 61M pixels). Our approximation scheme drastically reduces the cost of edit propagation methods (to 8 seconds and 22MB in this example) by exploring adaptive clustering in the affinity space. Video courtesy of BBC Motion Gallery (UK).

Abstract: Image/video editing by strokes has become increasingly popular due to the ease of interaction. Propagating the user inputs to the rest of the image/video, however, is often time- and memory-consuming, especially for large data. We propose here an efficient scheme that allows affinity-based edit propagation to be computed on data containing tens of millions of pixels at interactive rates (in a matter of seconds). The key to our scheme is a novel means of approximately solving the optimization problem involved in edit propagation, using adaptive clustering in a high-dimensional affinity space. Our approximation significantly reduces the cost of existing affinity-based propagation methods while maintaining visual fidelity, and enables interactive stroke-based editing even on high-resolution images and long video sequences using commodity computers.
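To make the idea of affinity-based propagation concrete, here is a deliberately simplified Python sketch. It is not the authors' solver: instead of solving the optimization problem the abstract mentions, it spreads stroke edits by an affinity-weighted average, and it stands in a uniform grid over feature space for the adaptive clustering described above. The feature vectors (RGB colors in [0, 1]), `sigma`, `cell`, and all function names are assumptions made for illustration.

```python
import math

def affinity(f_i, f_j, sigma=0.2):
    """Gaussian affinity between two feature vectors (assumed form)."""
    d2 = sum((a - b) ** 2 for a, b in zip(f_i, f_j))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def weighted_edit(f, features, strokes, sigma):
    """Affinity-weighted average of the stroke edits, seen from feature f."""
    pairs = [(affinity(f, features[j], sigma), g) for j, g in strokes.items()]
    total = sum(w for w, _ in pairs)
    return sum(w * g for w, g in pairs) / total if total > 0 else 0.0

def propagate(features, strokes, sigma=0.2):
    """Per-pixel propagation: one evaluation for every pixel."""
    return [weighted_edit(f, features, strokes, sigma) for f in features]

def propagate_clustered(features, strokes, cell=0.1, sigma=0.2):
    """Approximation: evaluate once per occupied grid cell in feature space
    and share the result with every pixel that falls in that cell."""
    cache = {}
    edits = []
    for f in features:
        key = tuple(int(math.floor(c / cell)) for c in f)
        if key not in cache:
            center = tuple((k + 0.5) * cell for k in key)
            cache[key] = weighted_edit(center, features, strokes, sigma)
        edits.append(cache[key])
    return edits

# Toy data: two reddish pixels, two bluish pixels, one gray pixel.
features = [(0.9, 0.1, 0.1), (0.88, 0.12, 0.1),
            (0.1, 0.1, 0.9), (0.12, 0.1, 0.88),
            (0.5, 0.5, 0.5)]
# Hypothetical strokes: full edit on pixel 0 (red), no edit on pixel 2 (blue).
strokes = {0: 1.0, 2: 0.0}

exact = propagate(features, strokes)
approx = propagate_clustered(features, strokes)
for e, a in zip(exact, approx):
    print(f"exact={e:.3f}  clustered={a:.3f}")
```

The per-pixel version costs one evaluation per pixel, while the clustered version costs one per occupied cell, which is where savings would come from when many pixels share similar features.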
including regression-based and learning-based methods, have been explored to achieve better rendering quality with less computational cost. However, most of these methods rely on handcrafted optimization objectives, which lead to artifacts such as blur and unfaithful details. In this paper, we present an adversarial approach for denoising Monte Carlo rendering. Our key insight is that generative adversarial networks can help denoiser networks to produce more realistic high-frequency details and global illumination by learning the distribution from a set of high-quality Monte Carlo path tracing images. We also adapt a novel feature modulation method to make better use of auxiliary features, including normals, albedo and depth. Compared to previous state-of-the-art methods, our approach produces a better reconstruction of the Monte Carlo integral from a few samples, performs more robustly at different sample rates, and takes only a second for megapixel images.
Panel labels (approximation, L2 error, frame rate): 1 SG, 9.2%, 180 fps; 3 SGs, 6.0%, 76 fps; 5 SGs, 3.4%, 55 fps; 7 SGs, 2.0%, 36 fps; 9 SGs, 1.1%, 27 fps; 11 SGs, 0.54%, 22 fps; 13 SGs, 0.26%, 19 fps; 15 SGs, 0.11%, 17 fps; 1 ASG, 0.10%, 125 fps; reference.

Figure 1: Comparison of the SG (Spherical Gaussian) based approximation with the ASG (Anisotropic Spherical Gaussian) based approximation in rendering a highly anisotropic metal dish, under an environment light and two local lights. The BRDF of the metal dish is approximated by different numbers of ASGs or SGs in different images. Notice the superior properties of ASGs over SGs. The result generated by 1 ASG already matches the path-traced reference well (with an L2 error of 0.10%) and achieves a high framerate of 125 fps, while achieving a similar quality requires more than 10 SGs at much lower framerates (19 fps for 13 SGs or 17 fps for 15 SGs). The L2 error and the framerate for each configuration are also given in the corresponding subtitle.

Abstract: We present a novel anisotropic Spherical Gaussian (ASG) function, built upon the Bingham distribution [Bingham 1974], which is much more effective and efficient in representing anisotropic spherical functions than Spherical Gaussians (SGs). In addition to retaining many desired properties of SGs, ASGs are also rotationally invariant and capable of representing all-frequency signals. To further strengthen the properties of ASGs, we have derived approximate closed-form solutions for their integral, product and convolution operators, whose errors are nearly negligible, as validated by quantitative analysis. Supported by all these operators, ASGs can be adapted in existing SG-based applications to enhance their scalability in handling anisotropic effects. To demonstrate the accuracy and efficiency of ASGs in practice, we have applied ASGs in two important SG-based rendering applications, and the experimental results clearly reveal the merits of ASGs.
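The text here does not give the functional forms, so the following Python sketch assumes the commonly cited definitions: an SG as `c * exp(lambda * (v·p - 1))` and an ASG as `c * max(v·z, 0) * exp(-lambda * (v·x)^2 - mu * (v·y)^2)` with an orthonormal frame `[x, y, z]`. It only illustrates why a single ASG lobe can be elongated along one tangent direction while an SG lobe cannot; the parameter values and helper names are illustrative assumptions.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def sg(v, axis, sharpness, amplitude=1.0):
    """Isotropic Spherical Gaussian (assumed form):
    amplitude * exp(sharpness * (v.axis - 1))."""
    return amplitude * math.exp(sharpness * (dot(v, axis) - 1.0))

def asg(v, x, y, z, lam, mu, amplitude=1.0):
    """Anisotropic Spherical Gaussian (assumed form):
    amplitude * max(v.z, 0) * exp(-lam * (v.x)^2 - mu * (v.y)^2),
    where x, y, z form an orthonormal frame and lam, mu are the
    bandwidths along the tangent (x) and bi-tangent (y) axes."""
    smooth = max(dot(v, z), 0.0)
    return amplitude * smooth * math.exp(
        -lam * dot(v, x) ** 2 - mu * dot(v, y) ** 2)

def direction(theta, phi):
    """Unit vector from polar angle theta and azimuth phi (radians)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

X, Y, Z = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
tilt = math.radians(20.0)
toward_x = direction(tilt, 0.0)           # tilted 20 degrees toward x
toward_y = direction(tilt, math.pi / 2)   # tilted 20 degrees toward y

# The ASG falls off quickly along x (lam large) and slowly along y (mu small).
print("ASG toward x:", asg(toward_x, X, Y, Z, lam=50.0, mu=2.0))
print("ASG toward y:", asg(toward_y, X, Y, Z, lam=50.0, mu=2.0))
# The SG depends only on the angle to its axis, so both tilts give the same value.
print("SG  toward x:", sg(toward_x, Z, sharpness=20.0))
print("SG  toward y:", sg(toward_y, Z, sharpness=20.0))
```

With equal tilt angles, the SG values coincide while the ASG values differ, which is consistent with the caption's point that many SGs are needed to mimic a single elongated ASG lobe.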