This paper presents a database containing 'ground truth' segmentations produced by humans for images of a wide variety of natural scenes. We define an error measure which quantifies the consistency between segmentations of differing granularities and find that different human segmentations of the same image are highly consistent. Use of this dataset is demonstrated in two applications: (1) evaluating the performance of segmentation algorithms and (2) measuring probability distributions associated with Gestalt grouping factors as well as statistics of image region properties.
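The abstract does not spell out the error measure, so the following is only a minimal sketch of one way to score consistency while tolerating differences in granularity. It assumes a per-pixel refinement error: the fraction of a pixel's region in one segmentation that falls outside its region in the other, with the smaller of the two directions taken at each pixel. The function names and the toy label lists are illustrative and are not taken from the paper.

```python
from collections import Counter

def local_refinement_errors(seg_a, seg_b):
    """For each pixel, the fraction of its seg_a region lying outside its seg_b region."""
    size_a = Counter(seg_a)                 # region sizes in seg_a
    overlap = Counter(zip(seg_a, seg_b))    # sizes of region intersections
    return [
        (size_a[a] - overlap[(a, b)]) / size_a[a]
        for a, b in zip(seg_a, seg_b)
    ]

def consistency_error(seg_a, seg_b):
    """Mean over pixels of the smaller directional error (0 means mutual refinement)."""
    e_ab = local_refinement_errors(seg_a, seg_b)
    e_ba = local_refinement_errors(seg_b, seg_a)
    return sum(min(x, y) for x, y in zip(e_ab, e_ba)) / len(seg_a)

# Toy segmentations as flat lists of region labels, one label per pixel (assumed data).
coarse = [0, 0, 0, 0, 1, 1, 1, 1]
fine = [0, 0, 1, 1, 2, 2, 2, 2]      # splits one coarse region: a pure refinement
shifted = [0, 0, 0, 1, 1, 1, 1, 1]   # moves the boundary by one pixel

refinement_score = consistency_error(coarse, fine)
shifted_score = consistency_error(coarse, shifted)
```

Under this assumed definition, a segmentation that only subdivides the regions of another scores zero, while a displaced boundary is penalized, which is the behavior needed when comparing segmentations of differing granularities.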
The goal of this work is to accurately detect and localize boundaries in natural scenes using local image measurements. We formulate features that respond to characteristic changes in brightness, color, and texture associated with natural boundaries. In order to combine the information from these features in an optimal way, we train a classifier using human-labeled images as ground truth. The output of this classifier provides the posterior probability of a boundary at each image location and orientation. We present precision-recall curves showing that the resulting detector significantly outperforms existing approaches. Our two main results are 1) that cue combination can be performed adequately with a simple linear model and 2) that a proper, explicit treatment of texture is required to detect boundaries in natural images.
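As a rough illustration of linear cue combination, the sketch below fits a logistic regression to three per-pixel cues and scores it with precision and recall at a single threshold. Everything here is an assumption made for illustration: the synthetic cue values stand in for brightness, color, and texture responses, the labels stand in for human boundary annotations, and the plain gradient-descent fit is not claimed to be the training procedure used in the paper.

```python
import math
import random

def predict(w, x):
    """Posterior boundary probability from a linear combination of cues."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(features, labels, lr=0.5, epochs=500):
    """Batch gradient descent on the logistic loss; returns [bias, w1, w2, ...]."""
    n, d = len(features), len(features[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for x, y in zip(features, labels):
            err = predict(w, x) - y
            grad[0] += err
            for j in range(d):
                grad[j + 1] += err * x[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def precision_recall(probs, labels, threshold):
    """Precision and recall of the thresholded detector against the labels."""
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    fn = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Synthetic stand-in data: each cue responds more strongly at boundary pixels.
random.seed(0)
labels = [i % 2 for i in range(200)]
features = [
    [random.gauss(0.7 if y else 0.3, 0.15) for _ in range(3)]
    for y in labels
]

weights = fit_logistic(features, labels)
probs = [predict(weights, x) for x in features]
precision, recall = precision_recall(probs, labels, threshold=0.5)
```

Sweeping `threshold` from 0 to 1 and collecting the resulting pairs traces out a precision-recall curve of the kind the abstract uses to compare detectors.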
Figure-ground organization refers to the visual perception that a contour separating two regions belongs to one of the regions. Recent studies have found neural correlates of figure-ground assignment in V2 as early as 10-25 ms after response onset, providing strong support for the role of local bottom-up processing. How much information about figure-ground assignment is available from locally computed cues? Using a large collection of natural images, in which neighboring regions were assigned a figure-ground relation by human observers, we quantified the extent to which figural regions locally tend to be smaller, more convex, and lie below ground regions. Our results suggest that these Gestalt cues are ecologically valid, and we quantify their relative power. We have also developed a simple bottom-up computational model of figure-ground assignment that takes image contours as input. Using parameters fit to natural image statistics, the model is capable of matching human-level performance when scene context is limited.