In this paper we present a scalable 3D video framework for capturing and rendering dynamic scenes. The acquisition system is based on multiple sparsely placed 3D video bricks, each comprising a projector, two grayscale cameras, and a color camera. Relying on structured light with complementary patterns, texture images and pattern-augmented views of the scene are acquired simultaneously by time-multiplexed projections and synchronized camera exposures. Using space-time stereo on the acquired pattern images, high-quality depth maps are extracted, whose corresponding surface samples are merged into a view-independent, point-based 3D data structure. This representation allows for effective photo-consistency enforcement and outlier removal, leading to a significant decrease of visual artifacts and a high resulting rendering quality using EWA volume splatting. Our framework and its view-independent representation allow for simple and straightforward editing of 3D video. In order to demonstrate its flexibility, we show compositing techniques and spatiotemporal effects.
3D video billboard clouds reconstruct and represent a dynamic three-dimensional scene using displacementmapped billboards. They consist of geometric proxy planes augmented with detailed displacement maps and combine the generality of geometry-based 3D video with the regularization properties of image-based 3D video. 3D video billboards are an image-based representation placed in the disparity space of the acquisition cameras and thus provide a regular sampling of the scene with a uniform error model. We propose a general geometry filtering framework which generates time-coherent models and removes reconstruction and quantization noise as well as calibration errors. This replaces the complex and time-consuming sub-pixel matching process in stereo reconstruction with a bilateral filter. Rendering is performed using a GPU-accelerated algorithm which generates consistent view-dependent geometry and textures for each individual frame. In addition, we present a semi-automatic approach for modeling dynamic three-dimensional scenes with a set of multiple 3D video billboards clouds.
We present a generic and versatile framework for interactive editing of 3D video footage. Our framework combines the advantages of conventional 2D video editing with the power of more advanced, depth-enhanced 3D video streams. Our editor takes 3D video as input and writes both 2D or 3D video formats as output. Its underlying core data structure is a novel 4D spatio-temporal representation which we call the video hypervolume. Conceptually, the processing loop comprises three fundamental operators: slicing, selection, and editing. The slicing operator allows users to visualize arbitrary hyperslices from the 4D data set. The selection operator labels subsets of the footage for spatio-temporal editing. This operator includes a 4D graph-cut based algorithm for object selection.The actual editing operators include cut & paste, affine transformations, and compositing with other media, such as images and 2D video. For high-quality rendering, we employ EWA splatting with view-dependent texturing and boundary matting. We demonstrate the applicability of our methods to post-production of 3D video.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.