A rapidly growing number of productions from the entertainment industry are aimed at 3D movie theatres. These productions use a two-view format, primarily intended for eye-wear-assisted viewing in a well-defined environment. To bring this 3D content into the home environment, where a large variety of 3D viewing conditions exists (e.g., different display sizes, display types, and viewing distances), we need a flexible 3D format that can adjust the depth effect. Such a format is the image-plus-depth format, in which a video frame is enriched with depth information for every pixel. This format can be extended with an additional layer of occluded video and associated depth, containing what lies behind objects in the video. To produce 3D content in this extended format, one has to deduce what is behind objects; this occluded data can be obtained along various axes. This paper presents a method to automatically detect and fill the occluded areas by exploiting the temporal axis. To get visually pleasing results, it is of the utmost importance to make the inpainting globally consistent. To do so, we start by analyzing data along the temporal axis and computing a confidence for each pixel. Pixels from the future and the past that are not visible in the current frame are then weighted and accumulated according to the computed confidences. These results are fed to a generic multi-source framework that computes the occlusion layer from the available confidences and occlusion data.
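The accumulation step described above can be sketched as a confidence-weighted blend. The function below is an illustrative reconstruction, not the paper's implementation; it assumes that candidate pixel values and per-pixel confidences have already been gathered from each past or future reference frame:

```python
import numpy as np

def accumulate_occlusion(candidates, confidences):
    """Blend candidate pixel values from past/future frames.

    candidates:  (T, H, W) array of candidate colour values for the
                 occluded area, one slice per reference frame.
    confidences: (T, H, W) array of per-pixel confidences in [0, 1].

    Returns the confidence-weighted average and the total confidence,
    which a multi-source framework could use to select the best source.
    """
    weights = np.asarray(confidences, dtype=float)
    total = weights.sum(axis=0)
    # Avoid division by zero where no reference frame contributes.
    safe = np.where(total > 0, total, 1.0)
    blended = (np.asarray(candidates, dtype=float) * weights).sum(axis=0) / safe
    return blended, total
```

Pixels with zero total confidence stay at value 0 here; in practice such holes would be passed on for spatial inpainting.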
In distributed video source coding, side information at the decoder is generated as a temporal prediction based on previous frames. This creates a virtual dependency channel between the source video at the encoder and the side information at the decoder. In recent years, distributed source coders have been introduced that use sophisticated error-correcting codes, such as Turbo codes and LDPC codes. Although these codes perform well on noisy network communication channels, it is far from obvious that they can handle the non-stationary noise of the dependency channel as encountered in distributed video coders. In this paper, we study the consequences of inaccurately modeling the dependency channel for Turbo and LDPC coding and show that performance depends greatly on the choice of probabilistic model for the dependency channel. The results show that LDPC codes are less sensitive to inaccuracies in the dependency-channel models.
In distributed video coding, the complexity of the video encoder is reduced at the cost of a more complex video decoder. Using the principles of Slepian and Wolf, video compression is then carried out using channel-coding principles, under the assumption that the video decoder can temporally predict side information that is correlated with the source video frames. Recent work on distributed video coding has studied the application of turbo codes, which perform well in typical (tele-)communication settings. However, in distributed video coding the dependency channel between source and side information is inherently non-stationary, for instance due to occluded regions in the video frames. In this paper, we study the modeling of the virtual dependency channel, as well as the consequences of incorrect model assumptions for the turbo-decoding process. We observe a strong dependency of the performance of the distributed video decoder on the model of the dependency channel.
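A common way to model such a virtual dependency channel, used here purely as an illustrative assumption rather than the papers' exact formulation, is additive zero-mean Laplacian noise between source and side information. The sketch below fits the Laplacian scale parameter from an observed residual and derives the probability that the residual exceeds a given magnitude, which is the kind of soft information a turbo or LDPC decoder would consume:

```python
import numpy as np

def fit_laplacian_alpha(residual):
    """Fit the scale parameter of a zero-mean Laplacian noise model.

    For a zero-mean Laplacian with pdf (alpha/2) * exp(-alpha*|x|),
    the variance is 2/alpha^2, so alpha = sqrt(2 / variance).
    """
    var = np.mean(np.square(np.asarray(residual, dtype=float)))
    return np.sqrt(2.0 / var)

def crossover_probability(alpha, delta):
    """P(|X - Y| > delta) under the Laplacian model, i.e. the chance
    the side information lies farther than delta from the source,
    e.g. past a quantisation boundary at that distance."""
    return np.exp(-alpha * delta)
```

A non-stationary channel, as caused by occlusions, is exactly the case where a single global `alpha` becomes a poor fit, which motivates studying the decoder's sensitivity to this model choice.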
Philips is developing a product line of multi-view auto-stereoscopic 3D displays.[1] For interfacing, the image-plus-depth format is used.[2, 3] Being independent of specific display properties, such as the number of views and the mapping of views onto the pixel grid, this interface format allows optimal multi-view visualisation of content from many different sources, while maintaining interoperability between display types. A rapidly growing number of productions from the entertainment industry are aimed at 3D movie theatres. These productions use a two-view format, primarily intended for eye-wear-assisted viewing. It has been shown[4] how to convert these sequences into the image-plus-depth format. This results in a single-layer depth profile, lacking information about areas that are occluded but can be revealed by stereoscopic parallax. Recently, it has been shown how to compute intermediate views for a stereo pair.[4, 5] Unfortunately, these approaches are not compatible with the image-plus-depth format, which might hamper their applicability for broadcast 3D television.[3]
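As a rough illustration of why the image-plus-depth format needs an occlusion layer, the sketch below synthesises a neighbouring view by shifting each pixel horizontally by a depth-derived disparity. The depth convention (0 = near, 1 = far) and the `gain` factor are assumptions for illustration only; the holes left behind are precisely the disoccluded areas that a single-layer depth profile cannot fill:

```python
import numpy as np

def render_view(image, depth, gain=4.0):
    """Shift pixels of a grayscale image by a depth-derived disparity.

    Nearer pixels (small depth) shift more; positions left uncovered
    stay at -1, marking disocclusion holes that an occlusion layer
    (or inpainting) would have to fill.
    """
    h, w = depth.shape
    out = np.full_like(np.asarray(image, dtype=float), -1.0)
    # Depth in [0, 1]; disparity grows as objects come closer.
    disparity = np.round(gain * (1.0 - depth)).astype(int)
    for y in range(h):
        for x in range(w):
            nx = x + disparity[y, x]
            if 0 <= nx < w:
                out[y, nx] = image[y, x]
    return out
```

With a constant far-plane depth the view is unchanged, while any depth step produces a hole next to the foreground object's edge.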