For one-shot learning gesture recognition, two important challenges are how to extract distinctive features and how to learn a discriminative model from only one training sample per gesture class. For feature extraction, a new spatio-temporal feature representation called 3D enhanced motion scale-invariant feature transform (3D EMoSIFT) is proposed, which fuses RGB-D data. Compared with other features, the new feature set is invariant to scale and rotation and provides a more compact and richer visual representation. For learning a discriminative model, all features extracted from the training samples are clustered with the k-means algorithm to learn a visual codebook. Then, unlike traditional bag of features (BoF) models, which use vector quantization (VQ) to map each feature to a single visual codeword, a sparse coding method named simulation orthogonal matching pursuit (SOMP) is applied, so that each feature is represented by a linear combination of a small number of codewords. Compared with VQ, SOMP leads to a much lower reconstruction error and achieves better performance. The proposed approach has been evaluated on the ChaLearn gesture database, and the result ranked among the top-performing techniques in the ChaLearn gesture challenge (round 2).
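The contrast between VQ and sparse coding over a k-means codebook can be illustrated with a minimal sketch. It uses plain orthogonal matching pursuit on single feature vectors as a stand-in, not the SOMP variant the abstract names, and random vectors in place of 3D EMoSIFT descriptors. The function names `kmeans_codebook`, `vq_reconstruct`, and `omp_reconstruct`, and the parameters `k`, `iters`, and `n_nonzero`, are assumptions made for illustration.

```python
import numpy as np

def kmeans_codebook(X, k, iters=20, seed=0):
    """Learn a codebook of k codewords (rows) with basic Lloyd iterations."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every feature to every codeword.
        d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old codeword if a cluster is empty
                C[j] = members.mean(axis=0)
    return C

def vq_reconstruct(x, C):
    """Vector quantization: approximate x by its single nearest codeword."""
    j = ((C - x) ** 2).sum(axis=1).argmin()
    return C[j]

def omp_reconstruct(x, C, n_nonzero=3):
    """Plain OMP (assumed stand-in for SOMP): approximate x by a linear
    combination of at most n_nonzero codewords (requires n_nonzero <= len(C))."""
    D = C / np.linalg.norm(C, axis=1, keepdims=True)  # unit-norm atoms as rows
    residual = x.astype(float).copy()
    support = []
    for _ in range(n_nonzero):
        corr = np.abs(D @ residual)
        if support:
            corr[support] = -np.inf  # never pick the same atom twice
        support.append(int(corr.argmax()))
        A = D[support].T  # selected atoms as columns
        coef, *_ = np.linalg.lstsq(A, x, rcond=None)
        residual = x - A @ coef
    return A @ coef
```

Because the first atom OMP selects minimizes the projection error over all codewords, and each later step refits by least squares on a larger support, the OMP reconstruction error in this sketch is never larger than the VQ error for the same codebook.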
Abstract. In this paper, a region-based shock-diffusion equation is presented for image denoising and edge sharpening. An image is divided into three types of regions according to image features: edges, textures and details, and flat areas. For edges, a shock-type backward diffusion is performed in the gradient direction, normal to the isophote line (edge), combined with a forward diffusion along the isophote line; for textures and details, a soft backward diffusion is applied to enhance image features while preserving a natural transition. In addition, an isotropic diffusion smooths flat areas simultaneously. Finally, a shock-capturing scheme with a special limiter function is developed to speed up the process while maintaining numerical stability. Experiments on real images show that this method produces visually better enhanced images than some related equations.
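The region split and the flat-area smoothing can be sketched roughly as follows. The abstract does not state how the regions are decided, so thresholding the gradient magnitude with two hypothetical thresholds `t_low` and `t_high` is purely an assumption, as are the function names and the time step `dt`. The shock-type and soft backward diffusion terms, and the limiter-based scheme, are not reproduced here.

```python
import numpy as np

def classify_regions(img, t_low, t_high):
    """Label each pixel 0 (flat), 1 (texture/detail) or 2 (edge).
    Assumption: regions are separated by gradient-magnitude thresholds."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    labels = np.zeros(img.shape, dtype=int)
    labels[mag >= t_low] = 1
    labels[mag >= t_high] = 2
    return labels

def smooth_flat(img, labels, dt=0.2):
    """One explicit step of isotropic (heat) diffusion, applied to flat pixels only."""
    u = img.astype(float)
    p = np.pad(u, 1, mode="edge")  # replicate borders
    # 5-point Laplacian.
    lap = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4 * u
    out = u.copy()
    flat = labels == 0
    out[flat] += dt * lap[flat]
    return out
```

A full implementation would apply different update terms to pixels labeled 1 and 2; this sketch leaves them unchanged so that only the flat-area behavior is shown.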