Many computer vision tasks can be formulated as labeling problems. The desired solution is often a spatially smooth labeling in which label transitions are aligned with color edges of the input image. We show that such solutions can be obtained efficiently by smoothing the label costs with a very fast edge-preserving filter. In this paper, we propose a generic and simple framework comprising three steps: 1) constructing a cost volume, 2) fast cost volume filtering, and 3) Winner-Takes-All label selection. Our main contribution is to show that this simple framework achieves state-of-the-art results for several computer vision applications. In particular, we achieve 1) real-time disparity maps whose quality exceeds that of all other fast (local) approaches on the Middlebury stereo benchmark, and 2) optical flow fields that contain very fine structures as well as large displacements. To demonstrate robustness, the few parameters of our framework are set to nearly identical values for both applications. Competitive results for interactive image segmentation are also presented. With this work, we hope to inspire other researchers to apply this framework to other application areas.
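The three steps named in this abstract can be illustrated with a minimal sketch for stereo on small grayscale images stored as nested lists. Several things here are assumptions made for illustration and are not taken from the paper: the function names, the absolute-difference matching cost, the out-of-bounds penalty, and above all the plain box filter, which stands in for the edge-preserving filter and does not preserve edges.

```python
def build_cost_volume(left, right, max_disp, out_of_bounds_cost=255.0):
    """Step 1: cost[d][y][x] = |left[y][x] - right[y][x - d]| for each disparity label d."""
    h, w = len(left), len(left[0])
    return [[[abs(left[y][x] - right[y][x - d]) if x - d >= 0 else out_of_bounds_cost
              for x in range(w)]
             for y in range(h)]
            for d in range(max_disp + 1)]


def box_filter(cost_slice, radius):
    """Step 2 (stand-in): mean over a (2r+1)x(2r+1) window clipped at the borders.

    A box filter is used only to keep the sketch short; it is NOT edge-preserving.
    """
    h, w = len(cost_slice), len(cost_slice[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        ys = range(max(0, y - radius), min(h, y + radius + 1))
        for x in range(w):
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            total = sum(cost_slice[j][i] for j in ys for i in xs)
            out[y][x] = total / (len(ys) * len(xs))
    return out


def winner_takes_all(volume):
    """Step 3: pick, per pixel, the label with the lowest filtered cost."""
    n, h, w = len(volume), len(volume[0]), len(volume[0][0])
    return [[min(range(n), key=lambda d: volume[d][y][x]) for x in range(w)]
            for y in range(h)]


def label_image(left, right, max_disp, radius=1):
    """Run all three steps and return a disparity label per pixel."""
    volume = build_cost_volume(left, right, max_disp)
    filtered = [box_filter(cost_slice, radius) for cost_slice in volume]
    return winner_takes_all(filtered)
```

Each label's cost slice is filtered independently, so the filtering cost grows linearly with the number of labels; swapping the box filter for an edge-preserving one would only change `box_filter`.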
Local stereo matching has recently made considerable progress through the introduction of new support aggregation schemes. These approaches estimate a pixel's support region via color segmentation. Our contribution lies in an improved method for accomplishing this segmentation. Inside a square support window, we compute the geodesic distance from all pixels to the window's center pixel. Pixels of low geodesic distance are given high support weights and therefore large influence in the matching process. In contrast to previous work, we enforce connectivity by using the geodesic distance transform. To obtain a high support weight, a pixel must have a path to the center point along which the color does not change significantly. This connectivity property leads to improved segmentation results and consequently to improved disparity maps. The success of our geodesic approach is demonstrated on the Middlebury images. According to the Middlebury benchmark, the proposed algorithm is the top performer among current state-of-the-art local stereo methods.
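The connectivity property described in this abstract can be illustrated with a small sketch on a grayscale window. It is an illustration under stated assumptions, not the paper's implementation: exact shortest paths are computed with Dijkstra's algorithm rather than a geodesic distance transform, the step cost is the absolute intensity difference between 8-connected neighbors, and the exponential weight with the parameter `gamma` is a hypothetical choice.

```python
import heapq
import math


def geodesic_support_weights(window, gamma=10.0):
    """Return a support weight per pixel based on geodesic distance to the window center.

    The geodesic distance of a pixel is the cost of the cheapest 8-connected path
    to the center, where each step costs the absolute intensity difference.
    Weight = exp(-distance / gamma) is an assumed mapping for illustration.
    """
    h, w = len(window), len(window[0])
    cy, cx = h // 2, w // 2
    dist = [[math.inf] * w for _ in range(h)]
    dist[cy][cx] = 0.0
    heap = [(0.0, cy, cx)]
    while heap:
        d, y, x = heapq.heappop(heap)
        if d > dist[y][x]:
            continue  # stale heap entry, a shorter path was already found
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if dy == 0 and dx == 0:
                    continue
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    nd = d + abs(window[ny][nx] - window[y][x])
                    if nd < dist[ny][nx]:
                        dist[ny][nx] = nd
                        heapq.heappush(heap, (nd, ny, nx))
    return [[math.exp(-dist[y][x] / gamma) for x in range(w)] for y in range(h)]


# A dark column at x == 3 separates the right-most column from the center.
window = [[100, 100, 100, 0, 100] for _ in range(5)]
weights = geodesic_support_weights(window)
```

In this window, the pixel at column 4 has the same intensity as the center, so a weight based on color similarity alone would rate it highly. Every path from it to the center crosses the dark column, so its geodesic weight is close to zero.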
This paper addresses the problem of extracting an alpha matte from a single photograph given a user-defined trimap. A crucial part of this task is the color modeling step, in which the optimal alpha value, together with its confidence, is estimated individually for each pixel. This forms the data term of the objective function. The color modeling step comprises three steps: (i) collecting a candidate set of potential foreground and background colors; (ii) selecting high-confidence samples from the candidate set; (iii) estimating a sparsity prior to remove blurry artifacts. We introduce novel ideas for each of these steps and show that our approach considerably improves over state-of-the-art techniques by evaluating it on a large database of 54 images with known high-quality ground truth.
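To make the per-pixel color modeling concrete, the sketch below shows how an alpha value can be estimated from one candidate foreground/background pair under the standard compositing model I = alpha * F + (1 - alpha) * B, and how a pair could be picked from a candidate set. The selection rule used here, lowest reconstruction error, is a generic stand-in and an assumption; it is not the confidence measure or the sparsity prior proposed in the paper, and all function names are hypothetical.

```python
import math


def estimate_alpha(pixel, fg, bg):
    """Project the pixel color onto the line from bg to fg and clamp to [0, 1]."""
    diff = [f - b for f, b in zip(fg, bg)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        return 0.0  # fg and bg coincide, so alpha is undetermined; 0.0 is an arbitrary choice
    num = sum((p - b) * d for p, b, d in zip(pixel, bg, diff))
    return min(1.0, max(0.0, num / denom))


def reconstruction_error(pixel, fg, bg, alpha):
    """Distance between the observed color and the composite alpha*fg + (1-alpha)*bg."""
    return math.sqrt(sum((p - (alpha * f + (1.0 - alpha) * b)) ** 2
                         for p, f, b in zip(pixel, fg, bg)))


def best_pair(pixel, fg_candidates, bg_candidates):
    """Return (error, alpha, fg, bg) for the candidate pair that best explains the pixel."""
    best = None
    for fg in fg_candidates:
        for bg in bg_candidates:
            alpha = estimate_alpha(pixel, fg, bg)
            err = reconstruction_error(pixel, fg, bg, alpha)
            if best is None or err < best[0]:
                best = (err, alpha, fg, bg)
    return best
```

For example, `best_pair((200, 0, 0), [(255, 0, 0), (0, 255, 0)], [(0, 0, 0)])` selects the red foreground candidate, because a mix of red and black reproduces the observed color while a mix of green and black cannot.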