In the past, stereo images have been captured primarily for 3D reconstruction. However, the depth information acquired from stereo can also be combined with saliency to highlight certain objects in a scene. This approach can make still images more interesting to look at and draw attention to objects of interest in the scene. We introduce this novel direction in this paper and discuss the theoretical framework behind the approach. Although we use depth from stereo in this work, our approach is applicable to depth data acquired from any sensor modality. Experimental results on both indoor and outdoor scenes demonstrate the benefits of our algorithm.

Index Terms: Focus/Defocus Processing, Depth from Stereo, Segmentation
INTRODUCTION

Depth-of-Field (DoF) is an essential component in producing photorealistic effects during image capture. DoF enhances images not only by giving viewers a "feeling" of depth but also by allowing them to focus on the important regions; it makes images look more natural. It is essential in making the focal points of an image stand out, by emphasizing the foreground and de-emphasizing the background [1]. When taking a photo with an optical camera, we can vary the size of the aperture to set the DoF, or "zone of focus," for the photo. Points within the DoF appear in focus, while points far from the focal plane are blurred. Photos with a small zone of focus are said to have a "shallow" DoF. Thus, to highlight objects of interest in a photo, we should select a shallow DoF containing only those objects, also known as the "Region-of-Interest" (ROI) or "foreground." However, there is often a requirement to render shallow-DoF effects during post-processing, such as when a professional photographer retouches photos in the studio, or during cinematographic editing sessions. Furthermore, in virtual reality and video games, real-time DoF effects are also an important part of enhancing visual realism. Thus, simulating DoF effects has become an important research topic in the field of computer vision.

Most DoF rendering algorithms post-process single-view images, and can be roughly categorized into object-space and image-space methods. Lee's method [2] performs image blurring via non-linear interpolation of mipmap images generated from a pinhole image. Multilayer approaches, like our proposed method, split an image into layers based on pixel depths.
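To make the multilayer idea concrete, the following is a minimal NumPy sketch, not the paper's actual implementation: the image is partitioned into depth layers, and each layer is blurred more the farther its depth is from the focal plane. The function names (`box_blur`, `layered_dof`), the number of layers, and the linear depth-to-radius mapping are illustrative assumptions.

```python
import numpy as np

def box_blur(img, radius):
    """Blur a 2-D image with a (2*radius+1)^2 box filter, built
    separably from running sums (edge-replicated borders)."""
    if radius == 0:
        return img.copy()
    k = 2 * radius + 1
    pad = np.pad(img, radius, mode="edge")
    c = np.cumsum(pad, axis=0)                      # vertical running sums
    pad = np.vstack([c[k - 1:k], c[k:] - c[:-k]]) / k
    c = np.cumsum(pad, axis=1)                      # horizontal running sums
    pad = np.hstack([c[:, k - 1:k], c[:, k:] - c[:, :-k]]) / k
    return pad

def layered_dof(img, depth, focal_depth, n_layers=4, max_radius=3):
    """Split the image into n_layers depth slices; composite each slice
    with a blur radius proportional to its distance from the focal plane."""
    out = np.zeros_like(img, dtype=float)
    span = depth.max() - depth.min() + 1e-6
    edges = np.linspace(depth.min(), depth.max() + 1e-6, n_layers + 1)
    for i in range(n_layers):
        mask = (depth >= edges[i]) & (depth < edges[i + 1])
        mid = 0.5 * (edges[i] + edges[i + 1])       # representative layer depth
        r = int(round(max_radius * abs(mid - focal_depth) / span))
        out[mask] = box_blur(img, r)[mask]          # copy this layer's pixels
    return out
```

A layer whose representative depth coincides with the focal plane is copied through unblurred, while the most distant layers receive the largest box filter; a real system would also handle partial visibility at layer boundaries, which this sketch ignores.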
In other approaches, hidden image areas are approximated by color extrapolation to solve the partial-visibility problem, using Fourier transforms, pyramid image processing, anisotropic diffusion, splatting, rectangle spreading, and so on [1].

Saliency is widely used to investigate visual attention. The biologically inspired method by Itti [3] determines image saliency using Difference-of-Gaussians. Itti's method was later extended with graph-based normalization to build the visual saliency model [4]. Other methods use frequency-domain processing, or combine global contrast and spatial relationship ...
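As a rough illustration of the Difference-of-Gaussians idea behind Itti-style saliency, here is a simplified NumPy sketch. It computes center-surround differences between fine and coarse Gaussian-blurred copies of a grayscale image; the function names, the scale choices, and the surround-at-twice-the-center-scale rule are illustrative assumptions, not Itti's full multi-scale pyramid model.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with edge-replicated borders."""
    r = max(1, int(3 * sigma))
    x = np.arange(-r, r + 1)
    kernel = np.exp(-x**2 / (2.0 * sigma**2))
    kernel /= kernel.sum()
    def conv1d(m):
        # pad then 'valid' convolution, so output length equals input length
        return np.convolve(np.pad(m, r, mode="edge"), kernel, mode="valid")
    out = np.apply_along_axis(conv1d, 0, img)   # blur columns
    out = np.apply_along_axis(conv1d, 1, out)   # blur rows
    return out

def dog_saliency(gray, sigmas=(1.0, 2.0, 4.0)):
    """Center-surround saliency: sum, over scales, of absolute
    differences between a fine (center) and a coarse (surround)
    Gaussian-blurred copy of the image, normalized to [0, 1]."""
    sal = np.zeros_like(gray, dtype=float)
    for s in sigmas:
        center = gaussian_blur(gray, s)
        surround = gaussian_blur(gray, 2 * s)   # assumed surround scale
        sal += np.abs(center - surround)
    sal -= sal.min()
    if sal.max() > 0:
        sal /= sal.max()
    return sal
```

On an image containing a bright blob against a dark background, such a map responds strongly around the blob and weakly in the uniform background, which is the property a DoF pipeline would exploit to pick the region to keep in focus.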