Abstract. We address the problem of contour-based perceptual grouping using a user-defined vocabulary of simple part models. We train a family of classifiers on the vocabulary and apply them to a region oversegmentation of the input image to detect closed contours that are consistent with some shape in the vocabulary. The resulting consistent cycles are then both abstracted and categorized through a novel application of an active shape model, also trained on the vocabulary. From an image of a real object, our framework recovers the projections of the abstract surfaces that comprise an idealized model of the object. We evaluate our framework on a newly constructed dataset annotated with a set of ground-truth abstract surfaces.
Recent work in the object recognition community has yielded a class of interest-point-based features that are stable under significant changes in scale, viewpoint, and illumination, making them ideally suited to landmark-based navigation. Although many such features may be visible in a given view of the robot's environment, only a few are necessary to estimate the robot's position and orientation. In this paper, we address the problem of automatically selecting, from the entire set of features visible in the robot's environment, the minimum (optimal) set by which the robot can navigate its environment. Specifically, we decompose the world into a small number of maximally sized regions, such that at each position within a given region, the same small set of features is visible. We introduce a novel graph-theoretic formulation of the problem and prove that it is NP-complete. Next, we introduce a number of approximation algorithms and evaluate them on both synthetic and real data. Finally, we use the decompositions from the real image data to compare localization performance against that of the undecomposed map.
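Since the abstract proves the minimal-feature-selection problem NP-complete and resorts to approximation algorithms, the flavor of such an approximation can be illustrated with a greedy set-cover heuristic: repeatedly pick the feature visible from the most still-uncovered positions. The visibility map below is purely hypothetical, and the paper's actual formulation (a region decomposition) is richer than plain set cover; this is only a sketch of the general approximation style.

```python
def greedy_feature_cover(visibility):
    """Greedy set-cover heuristic.

    visibility: dict mapping feature id -> set of positions where it is visible.
    Returns a small list of features whose visibility sets jointly cover
    every position (when full coverage is possible).
    """
    uncovered = set().union(*visibility.values())
    chosen = []
    while uncovered:
        # Pick the feature covering the most still-uncovered positions.
        best = max(visibility, key=lambda f: len(visibility[f] & uncovered))
        if not visibility[best] & uncovered:
            break  # no feature covers the remaining positions
        chosen.append(best)
        uncovered -= visibility[best]
    return chosen

# Hypothetical toy world: six positions, four candidate landmark features.
visibility = {
    "f1": {0, 1, 2},
    "f2": {2, 3},
    "f3": {3, 4, 5},
    "f4": {0, 5},
}
print(greedy_feature_cover(visibility))  # -> ['f1', 'f3']
```

The greedy heuristic achieves the classic logarithmic approximation guarantee for set cover, which is essentially the best possible for this class of NP-complete coverage problems unless P = NP.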
We present a novel approach to recovering the qualitative 3-D part structure from a single 2-D image. We do not assume any knowledge of the objects contained in the scene, but rather assume that they are composed from a user-defined vocabulary of qualitative 3-D volumetric part categories input to the system. Given a set of 2-D part hypotheses recovered from an image, representing projections of the surfaces of the 3-D part categories, our method simultaneously perceptually groups subsets of the 2-D part hypotheses into 3-D part "views", from which the shape and pose parameters of the volumetric parts are recovered. The resulting 3-D parts and their relations offer the potential for a domain-independent, viewpoint-invariant shape indexing mechanism that can help manage the complexity of recognizing an object from a large database.