The conversion of existing 2D images to 3D is proving commercially viable and fulfills the growing need for high-quality stereoscopic images. This approach is particularly effective when creating content for the new generation of autostereoscopic displays, which require multiple stereo images. The dominant technique for such content conversion is to develop a depth map for each frame of 2D material. The use of a depth map as part of the 2D-to-3D conversion process has a number of desirable characteristics: (1) the resolution of the depth map may be lower than that of the associated 2D image; (2) it can be highly compressed; (3) 2D compatibility is maintained; and (4) real-time generation of stereo, or multiple stereo pairs, is possible. The main disadvantage has been the laborious nature of the manual conversion techniques used to create depth maps from existing 2D images, which results in a slow and costly process. An alternative, highly productive technique has been developed based upon the use of Machine Learning Algorithms (MLAs). This paper describes the application of MLAs to the generation of depth maps and presents the results of the commercial application of this approach.
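The real-time stereo generation mentioned above is commonly done by depth-image-based rendering: each pixel is shifted horizontally by a disparity derived from its depth value. The sketch below illustrates the idea only; the function name, the linear depth-to-disparity mapping, and the `max_disparity` parameter are illustrative assumptions, and a production renderer would also fill the occlusion holes this naive warp leaves behind.

```python
import numpy as np

def render_stereo_view(image, depth, max_disparity=16):
    """Synthesise one view of a stereo pair by shifting each pixel
    horizontally by a disparity derived from its depth value.

    Assumes a grayscale image and an 8-bit depth map of the same shape,
    where larger depth values mean "nearer" and thus larger disparity.
    """
    h, w = depth.shape
    out = np.zeros_like(image)
    # Map depth 0..255 linearly onto 0..max_disparity (an assumed mapping).
    disparity = (depth.astype(np.float32) / 255.0 * max_disparity).astype(int)
    for y in range(h):
        for x in range(w):
            nx = x + disparity[y, x]
            if 0 <= nx < w:
                out[y, nx] = image[y, x]
    return out
```

Because the depth map can be low resolution and highly compressed, this warp is cheap enough to run per frame, which is what makes on-the-fly generation of multiple stereo pairs for autostereoscopic displays practical.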
This paper describes progress towards engineering a combined GIS and image understanding system, designed to analyse remotely-sensed imagery in a task-oriented manner, with the task and contextual parameters being supplied by the GIS. The overall aim is to evaluate some of the more recent developments in the field of computer vision and to show how they may be integrated (along with more traditional methods of image interpretation) to improve system performance in terms of object recognition and extraction. Three aspects are reported: (1) development of a formal notation as the basis of describing the process of transforming data from low-level image representations to high-level object representations; (2) design of collaborating low- and high-level image processes used in this abstraction process and their control via Process Networks encapsulated into frames; and (3) use of Dempster-Shafer Decision Theory as the basis for combining different types of evidence to improve the recognition of objects in the scene. Some sample results are presented to illustrate the issues raised, and mechanisms for the co-operative processing of data are suggested.
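The evidence-combination step in point (3) rests on Dempster's rule: two mass functions over subsets of a frame of discernment are pooled pairwise, with mass assigned to conflicting (disjoint) hypotheses normalised away. A minimal sketch, not the paper's implementation; the hypothesis labels in the usage note are invented for illustration.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule of combination.

    Each mass function is a dict mapping a frozenset of hypotheses to a
    belief mass; masses on disjoint subsets are treated as conflict and
    normalised out.
    """
    combined = {}
    conflict = 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + ma * mb
            else:
                conflict += ma * mb
    if conflict >= 1.0:
        raise ValueError("total conflict: the evidence sources are incompatible")
    # Renormalise so the surviving masses again sum to one.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}

# Hypothetical example: two evidence sources for a scene object that may
# be a road or a building (frozenset({'road', 'building'}) is "unsure").
m1 = {frozenset({'road'}): 0.6, frozenset({'road', 'building'}): 0.4}
m2 = {frozenset({'road'}): 0.5, frozenset({'building'}): 0.3,
      frozenset({'road', 'building'}): 0.2}
pooled = dempster_combine(m1, m2)
```

In this toy run the agreeing evidence for "road" reinforces itself (its pooled mass rises above either input), which is exactly the behaviour that makes the rule attractive for merging GIS context with image-derived cues.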
The MPEG-4 standard generated a need for the extraction of Video Object Planes for use in video retrieval and description. Two later standards, MPEG-7 and MPEG-21, further increase the need for systems that can accurately extract semantic video objects with minimal user interaction. Many previous approaches have either relied too heavily on user interaction or compromised end accuracy to achieve a faster segmentation process. With the advancement of computer processing power, we propose a higher-quality segmentation process whose end application targets a hardware implementation. The focus of our approach is to achieve a reliable, high-quality segmentation mask per frame, using sophisticated offline techniques to minimise the user interaction within the process.