We propose an efficient and robust solution for image set classification. A joint representation of an image set is proposed which includes the image samples of the set and their affine hull model. The model accounts for unseen appearances in the form of affine combinations of sample images. To calculate the between-set distance, we introduce the Sparse Approximated Nearest Point (SANP). SANPs are the nearest points of two image sets such that each point can be sparsely approximated by the image samples of its respective set. This novel sparse formulation enforces sparsity on the sample coefficients and jointly optimizes the nearest points as well as their sparse approximations. Unlike standard sparse coding, the data to be sparsely approximated are not fixed. A convex formulation is proposed to find the optimal SANPs between two sets, and the accelerated proximal gradient method is adapted to efficiently solve this optimization. We also derive the kernel extension of the SANP and propose an algorithm for dynamically tuning the RBF kernel parameter while matching each pair of image sets. Comprehensive experiments on the UCSD/Honda, CMU MoBo, and YouTube Celebrities face datasets show that our method consistently outperforms the state of the art.
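To make the joint optimization concrete, here is a minimal Python (numpy) sketch of the SANP objective. It substitutes plain projected subgradient descent for the paper's accelerated proximal gradient solver, and the function name, step size, and penalty weight are illustrative assumptions rather than the published algorithm.

```python
import numpy as np

def sanp_distance(X, Y, lam=0.01, n_iter=200, lr=0.01):
    """Toy sketch of Sparse Approximated Nearest Points.

    X: d x m samples of set A; Y: d x n samples of set B.
    Finds coefficient vectors a, b so that the points Xa and Yb
    are close, with an l1 penalty encouraging each point to be a
    sparse combination of its set's samples. The step size lr
    assumes roughly unit-norm sample vectors.
    """
    m, n = X.shape[1], Y.shape[1]
    a = np.full(m, 1.0 / m)          # start at the set means
    b = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        r = X @ a - Y @ b            # residual between candidate nearest points
        # (sub)gradients of ||Xa - Yb||^2 + lam*(||a||_1 + ||b||_1)
        ga = 2 * X.T @ r + lam * np.sign(a)
        gb = -2 * Y.T @ r + lam * np.sign(b)
        a -= lr * ga
        b -= lr * gb
        # project back onto the affine-hull constraint sum(coeffs) = 1
        a += (1.0 - a.sum()) / m
        b += (1.0 - b.sum()) / n
    return np.linalg.norm(X @ a - Y @ b), a, b
```

The affine-hull constraint (coefficients summing to one) is what lets the nearest points range over unseen appearances rather than only the observed samples.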
We present a robust salient region detection framework based on the color and orientation distribution in images. The proposed framework consists of a color saliency framework and an orientation saliency framework. The color saliency framework detects salient regions based on the spatial distribution of the component colors in the image space and their remoteness in the color space. The dominant hues in the image are used to initialize an expectation-maximization (EM) algorithm that fits a Gaussian mixture model in the hue-saturation (H-S) space. The resulting mixture of Gaussians is used to compute the inter-cluster distance in the H-S domain as well as the relative spread of the corresponding colors in the spatial domain. The orientation saliency framework detects salient regions based on the global and local behavior of different orientations in the image. The oriented spectral information from the Fourier transform of local patches is used to obtain the local orientation histogram of the image. Salient regions are then detected by identifying spatially confined orientations and local patches that possess high orientation-entropy contrast. The final saliency map is selected as either the color saliency map or the orientation saliency map by automatically identifying which of the two correctly localizes the salient region. Experiments are carried out on a large image database annotated with ground-truth salient regions, provided by Microsoft Research Asia, which enables robust objective comparisons with other salient region detection algorithms.
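As an illustration of the color saliency stage, the following Python sketch (using OpenCV and scikit-learn, both assumed available) fits a Gaussian mixture in hue-saturation space and scores each color cluster by its remoteness in H-S space relative to its spatial spread; the EM initialization from dominant hues and the paper's exact spread measure are simplified away.

```python
import numpy as np
import cv2
from sklearn.mixture import GaussianMixture

def color_saliency(bgr_img, n_components=5):
    """Toy color saliency: clusters pixels in H-S space, then scores
    each cluster as (remoteness in color space) / (spatial spread),
    so remote colors occupying compact regions score highest."""
    hsv = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2HSV)
    h, w = hsv.shape[:2]
    hs = hsv[..., :2].reshape(-1, 2).astype(np.float64)
    gmm = GaussianMixture(n_components=n_components, random_state=0).fit(hs)
    labels = gmm.predict(hs)
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([xs.ravel(), ys.ravel()], axis=1)
    scores = np.zeros(n_components)
    for k in range(n_components):
        pts = coords[labels == k]
        if len(pts) == 0:
            continue
        spread = pts.std(axis=0).mean() + 1e-6     # spatial compactness
        remoteness = np.mean(np.linalg.norm(       # distance to other cluster means
            gmm.means_[k] - np.delete(gmm.means_, k, axis=0), axis=1))
        scores[k] = remoteness / spread
    sal = scores[labels].reshape(h, w)
    return sal / (sal.max() + 1e-12)               # normalized saliency map
```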
We formulate salient object detection in images as an automatic labeling problem on the vertices of a weighted graph. The seed (labeled) nodes are first detected using Markov random walks performed on two different graphs that represent the image: the global properties of the image are computed from the random walk on a complete graph, while the local properties are computed from a sparse k-regular graph. The most salient node is selected as the one that is globally most isolated yet falls on a locally compact object. A few background nodes and salient nodes are then identified based on the random-walk hitting time to the most salient node; together, these constitute the labeled nodes. A new graph representation of the image, the "pop-out graph" model, which captures the saliency relations between nodes more accurately, is then constructed using the labeled salient and background nodes. A semi-supervised learning technique determines the labels of the unlabeled nodes by optimizing a label-smoothness objective on the "pop-out graph" together with weighted soft constraints on the labeled nodes.
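The hitting-time computation that drives the seed selection is standard and can be sketched directly. The Python sketch below computes expected hitting times to a chosen target node for the random walk on a weighted graph; the complete-graph/k-regular-graph construction and the pop-out-graph refinement are left out.

```python
import numpy as np

def hitting_times(W, target):
    """Expected hitting time from every node to `target` for the
    random walk with transition matrix P = D^{-1} W. Solves
    (I - Q) h = 1 over the non-target nodes, where Q is the walk
    restricted to those nodes. Assumes a connected graph with
    positive row sums."""
    P = W / W.sum(axis=1, keepdims=True)
    n = W.shape[0]
    idx = [i for i in range(n) if i != target]
    Q = P[np.ix_(idx, idx)]                       # walk among non-target nodes
    h_sub = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))
    h = np.zeros(n)                               # h[target] = 0 by definition
    h[idx] = h_sub
    return h
```

Nodes with large hitting times to the most salient node are natural background seeds, which is how the labeled set is assembled before the semi-supervised propagation step.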
We propose 'Contour Code', a novel representation and binary hash table encoding for multispectral palmprint recognition. We first present a reliable technique for extracting a region of interest (ROI) from palm images acquired with non-contact sensors. The Contour Code representation is then derived from the Nonsubsampled Contourlet Transform: a uniscale pyramidal filter is convolved with the ROI, followed by the application of a directional filter bank. The dominant directional subband establishes the orientation at each pixel, and the index corresponding to this subband is encoded in the Contour Code representation. Unlike existing representations, which extract orientation features directly from the palm images, the Contour Code uses two-stage filtering to extract robust orientation features. The Contour Code is binarized into an efficient hash table structure that requires only indexing and summation operations for simultaneous one-to-many matching, with an embedded score-level fusion of multiple bands. We quantitatively evaluate the accuracy of the ROI extraction by comparison with a manually produced ground truth. Multispectral palmprint verification results on the PolyU and CASIA databases show that the Contour Code achieves an EER reduction of up to 50% compared to state-of-the-art methods.
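A rough Python sketch of the encoding and matching idea follows. It assumes the directional subband magnitudes from the nonsubsampled contourlet transform have already been computed (no NSCT implementation is assumed here), encodes the dominant subband index per pixel as a one-hot binary code, and scores a pair of codes by agreement counts, which is the pairwise analogue of the paper's hash-table indexing and summation.

```python
import numpy as np

def encode_orientation_index(subband_responses):
    """subband_responses: k x h x w magnitudes from a k-direction
    filter bank (assumed precomputed). Each pixel is assigned the
    index of its dominant subband, then one-hot binarized so that
    matching reduces to bitwise operations."""
    idx = np.argmax(np.abs(subband_responses), axis=0)   # dominant direction per pixel
    k = subband_responses.shape[0]
    return (np.arange(k)[:, None, None] == idx).astype(np.uint8)

def match_score(code_a, code_b):
    """Fraction of pixels whose dominant orientation agrees. The
    hash-table structure in the paper amortizes this over a whole
    gallery via indexing plus summation; here it is pairwise."""
    h, w = code_a.shape[1], code_a.shape[2]
    return float((code_a & code_b).sum()) / (h * w)
```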