Accurately segmenting images with intensity inhomogeneity is often difficult, because most representative algorithms are region-based and depend on the intensity homogeneity of the object of interest. In this paper, we present a novel level set method for image segmentation in the presence of intensity inhomogeneity. The inhomogeneous objects are modeled as Gaussian distributions with different means and variances, and a sliding window is used to map the original image into another domain in which the intensity distribution of each object is still Gaussian but better separated. The means of the Gaussian distributions in the transformed domain can be adaptively estimated by multiplying the original signal within the window by a bias field. A maximum likelihood energy functional is then defined over the whole image region, combining the bias field, the level set function, and a piecewise constant function approximating the true image signal. The proposed level set method can be applied directly to simultaneous segmentation and bias correction of 3T and 7T magnetic resonance images. Extensive evaluations on synthetic and real images demonstrate the superiority of the proposed method over other representative algorithms.
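As a rough illustration of the alternating estimation this model implies, the sketch below fits per-region means and a smooth multiplicative bias field under a fixed two-phase partition. It is a minimal sketch only: the hard mask stands in for the evolving level set function, the Gaussian variances are omitted, and all names are our own, not the paper's.

```python
import numpy as np
from scipy.ndimage import uniform_filter  # sliding-window local averaging

def estimate_bias_and_means(image, mask, window=15, n_iter=10):
    """Toy alternating estimation under I(x) ~ N(b(x) * c_k, sigma_k^2).
    `mask` is a hard two-phase partition (True = object); the paper instead
    evolves a level set function and also estimates the variances."""
    b = np.ones_like(image, dtype=float)
    for _ in range(n_iter):
        # update the region means c_k given the bias field (least squares)
        c = []
        for region in (mask, ~mask):
            c.append((image * b)[region].sum()
                     / max((b ** 2)[region].sum(), 1e-12))
        # piecewise constant approximation of the true signal
        J = np.where(mask, c[0], c[1])
        # update the bias field with sliding-window smoothing
        b = uniform_filter(image * J, size=window) \
            / np.maximum(uniform_filter(J ** 2, size=window), 1e-12)
    return b, c
```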
The human visual system (HVS) can reliably perceive salient objects in an image, but it remains a challenge to computationally model the process of detecting salient objects without prior knowledge of the image contents. This paper proposes a visual-attention-aware model that mimics the HVS for salient-object detection. Informative and directional patches can be viewed as visual stimuli and used as neuronal cues by humans to interpret and detect salient objects. To simulate this process, the two types of patches are extracted individually and in parallel, as primitives, from the intensity channel and the discriminant color channel, respectively. In our algorithm, an improved wavelet-based salient-patch detector extracts the visually informative patches. In addition, because humans are sensitive to orientation features and directional patches are reliable cues, we also propose a method for extracting directional patches. These two types of patches are then combined to form the most important patches, called preferential patches, which are treated as the visual stimuli applied to the HVS for salient-object detection. Experimental results on publicly available datasets show that our proposed algorithm is reliable and effective compared with state-of-the-art methods for salient-object detection.
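To make the two patch types concrete, here is a minimal sketch of plausible stand-ins: detail-subband energy from a single wavelet decomposition as a crude "informative" score, and structure-tensor orientation coherence as a crude "directional" score. Both functions, and the PyWavelets dependency, are our assumptions, not the paper's detectors.

```python
import numpy as np
import pywt  # PyWavelets, assumed available
from scipy.ndimage import uniform_filter

def informative_map(gray, wavelet="haar"):
    """Score locations by high-frequency (detail) subband energy."""
    _, (cH, cV, cD) = pywt.dwt2(gray.astype(float), wavelet)
    energy = cH ** 2 + cV ** 2 + cD ** 2
    energy = np.repeat(np.repeat(energy, 2, axis=0), 2, axis=1)  # upsample
    return energy[: gray.shape[0], : gray.shape[1]]

def directional_map(gray, size=9):
    """Structure-tensor coherence in [0, 1]; near 1 = strongly oriented."""
    gy, gx = np.gradient(gray.astype(float))
    Jxx, Jyy, Jxy = (uniform_filter(a, size)
                     for a in (gx * gx, gy * gy, gx * gy))
    return np.sqrt((Jxx - Jyy) ** 2 + 4 * Jxy ** 2) \
        / np.maximum(Jxx + Jyy, 1e-12)
```

Thresholding and combining the two maps would then yield candidate preferential patches.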
Most existing person re-identification (ReID) methods rely only on the spatial appearance information from one or multiple person images, whilst ignoring the space-time cues readily available in video or image-sequence data. Moreover, they often assume the availability of exhaustively labelled cross-view pairwise data for every camera pair, making them non-scalable to ReID applications in real-world large-scale camera networks. In this work, we introduce a novel video-based person ReID method capable of accurately matching people across views from arbitrary unaligned image sequences without any labelled pairwise data. Specifically, we introduce a new space-time person representation that encodes multiple granularities of spatio-temporal dynamics in the form of time series. Moreover, a Time Shift Dynamic Time Warping (TS-DTW) model is derived to perform automatic alignment whilst achieving data selection and matching between inherently inaccurate and incomplete sequences in a unified way. We further extend the TS-DTW model to accommodate multiple feature sequences of an image sequence in order to fuse information from different descriptions. Crucially, this model does not require pairwise labelled training data (i.e. it is unsupervised) and is therefore readily scalable to large camera networks with arbitrary camera pairs, without the need for exhaustive data annotation for every camera pair. We show the effectiveness and advantages of the proposed method through extensive comparisons with related state-of-the-art approaches on two benchmark ReID datasets, PRID2011 and iLIDS-VID.
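For reference, the sketch below implements the plain dynamic time warping baseline that TS-DTW extends; the time-shift matching and data-selection mechanisms of the paper are not reproduced here.

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """Plain DTW between two feature sequences (arrays of shape T_a x D and
    T_b x D), returning a length-normalised alignment cost."""
    Ta, Tb = len(seq_a), len(seq_b)
    cost = np.full((Ta + 1, Tb + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, Ta + 1):
        for j in range(1, Tb + 1):
            d = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])  # frame distance
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[Ta, Tb] / (Ta + Tb)
```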
In video surveillance, the captured face images are usually of low resolution. This paper therefore proposes a framework based on singular value decomposition (SVD) for performing face hallucination and recognition simultaneously. Conventionally, low-resolution (LR) face recognition is carried out by first super-resolving the LR input face and then performing recognition on the result. By considering face hallucination and recognition jointly, the accuracy of both can be improved. In this paper, we first prove that singular values are effective for representing face images and that the singular values of a face image at different resolutions are approximately linearly related. In our algorithm, each face image is represented using its SVD. For each LR input face, the corresponding LR and high-resolution (HR) face-image pairs are selected from the face gallery. Based on these selected LR-HR pairs, the mapping functions for interpolating the two matrices in the SVD representation, used to reconstruct HR face images, can be learned more accurately. The final estimation of the high-frequency details of the HR face images therefore becomes more reliable. Experimental results demonstrate that our proposed framework achieves promising results for both face hallucination and recognition.
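As a loose illustration of how the linear relation between singular values across resolutions could be exploited, the sketch below rescales the singular values of a selected HR exemplar using those of the LR input. The paper's learned mapping functions for the two matrices are reduced here to a single fitted scale factor, so this is an assumption-laden simplification, not the authors' method.

```python
import numpy as np

def hallucinate_face_svd(lr_face, hr_exemplar):
    """Toy SVD hallucination: keep the HR exemplar's U and V factors and
    estimate the HR singular values from the LR input via one linear scale."""
    s_lr = np.linalg.svd(lr_face, compute_uv=False)
    U, s_hr, Vt = np.linalg.svd(hr_exemplar, full_matrices=False)
    k = min(len(s_lr), len(s_hr))
    # least-squares fit of s_hr ~ alpha * s_lr over the leading k values
    alpha = float(s_hr[:k] @ s_lr[:k]) / float(s_lr[:k] @ s_lr[:k])
    s_est = s_hr.copy()          # keep the exemplar's tail for fine detail
    s_est[:k] = alpha * s_lr[:k] # rescaled LR singular values
    return U @ np.diag(s_est) @ Vt
```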