This paper presents a new tensor motion descriptor that uses only optical flow and HOG3D information: no interest points are extracted, and it does not rely on a visual dictionary. We propose a new aggregation technique based on tensors, a double aggregation of tensor descriptors. The first represents motion using polynomial coefficients that approximate the optical flow. The second represents the accumulated data of all histograms of gradients of the video. The descriptor is evaluated by classifying the KTH, UCF11 and Hollywood2 datasets with an SVM classifier. Our method reaches a recognition rate of 93.2% on KTH, comparable to the best local approaches. On the UCF11 and Hollywood2 datasets, it achieves fairly competitive results compared to local and learning-based approaches. Keywords: Global motion descriptor, optical flow, histogram of gradients, action recognition
Motion is one of the main characteristics that describe the semantic information of videos. In this work, a global video descriptor based on orientation tensors is proposed. This descriptor is obtained by combining polynomial coefficients calculated for each image in a video. The coefficients are found by projecting the optical flow onto Legendre polynomials, which reduces the dimension of the per-frame motion estimates. The sequence of coefficients is then combined using orientation tensors. The resulting global tensor descriptor is evaluated by classifying the KTH video database with an SVM classifier.
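The pipeline in this abstract (per-frame Legendre coefficients, then tensor accumulation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the polynomial degree, the use of a least-squares fit in place of an exact projection (sampled Legendre polynomials are only approximately orthogonal on a pixel grid), and the final normalization are all hypothetical choices. The optical flow fields are assumed to be precomputed arrays of shape `(H, W, 2)`.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_basis(height, width, degree):
    """Rows are 2D products P_i(y) * P_j(x) with i + j <= degree, flattened."""
    y = np.linspace(-1.0, 1.0, height)
    x = np.linspace(-1.0, 1.0, width)
    # legvander(t, d)[:, k] holds the Legendre polynomial P_k evaluated at t.
    vy = legendre.legvander(y, degree)
    vx = legendre.legvander(x, degree)
    rows = []
    for i in range(degree + 1):
        for j in range(degree + 1 - i):
            rows.append(np.outer(vy[:, i], vx[:, j]).ravel())
    return np.stack(rows)


def frame_coefficients(flow, basis):
    """Least-squares fit of one flow field (H, W, 2) to the basis (assumption)."""
    h, w, _ = flow.shape
    targets = flow.reshape(h * w, 2)
    coeffs, *_ = np.linalg.lstsq(basis.T, targets, rcond=None)
    # Concatenate horizontal-flow coefficients, then vertical-flow coefficients.
    return coeffs.T.ravel()


def video_tensor_descriptor(flows, degree=2):
    """Accumulate per-frame orientation tensors c c^T into one video descriptor."""
    h, w, _ = flows[0].shape
    basis = legendre_basis(h, w, degree)
    n = 2 * basis.shape[0]
    tensor = np.zeros((n, n))
    for flow in flows:
        c = frame_coefficients(flow, basis)
        tensor += np.outer(c, c)
    norm = np.linalg.norm(tensor)
    if norm > 0:
        tensor /= norm  # hypothetical normalization to reduce length dependence
    # The tensor is symmetric, so the upper triangle carries all the information.
    return tensor[np.triu_indices(n)]
```

The returned vector could then be passed to any standard SVM implementation; with `degree=2` there are 6 basis functions per flow component, giving a 12 x 12 tensor and a 78-dimensional descriptor.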
Abstract. We propose a method for the assessment and visualization of high-frequency regions of a multiresolution image. We combine orientation tensors with multiresolution analysis to obtain a scalar descriptor of high-frequency regions. High values of this scalar field indicate regions whose detail vectors coincide across multiple scales of a wavelet decomposition. This is useful for finding edges, textures, collinear structures and salient regions for computer vision methods. The image is decomposed into several scales using the Discrete Wavelet Transform (DWT). The resulting detail spaces form vectors indicating intensity variations, which are combined using orientation tensors. A high-frequency scalar descriptor is then obtained from the resulting tensor for each pixel of the original image. Our results show that this descriptor indicates areas with relevant intensity variation at multiple scales.
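The idea of combining multiscale detail vectors through an orientation tensor can be illustrated with a small sketch. Several simplifications are assumptions made here for brevity and are not taken from the paper: dyadic finite differences at full resolution stand in for true DWT detail coefficients, borders wrap around, and the scalar is taken as the difference between the two eigenvalues of the 2 x 2 tensor (large when detail vectors at different scales point the same way). The function names are hypothetical.

```python
import numpy as np


def detail_vectors(image, scale):
    """Haar-like horizontal and vertical differences at step 2**scale.

    A simplified stand-in for DWT detail spaces; np.roll wraps at the borders.
    """
    step = 2 ** scale
    dx = np.roll(image, -step, axis=1) - image
    dy = np.roll(image, -step, axis=0) - image
    return dx, dy


def high_frequency_map(image, n_scales=3):
    """Per-pixel scalar from the tensor T = sum over scales of v v^T, v = (dx, dy)."""
    image = np.asarray(image, dtype=float)
    txx = np.zeros_like(image)
    txy = np.zeros_like(image)
    tyy = np.zeros_like(image)
    for scale in range(n_scales):
        dx, dy = detail_vectors(image, scale)
        txx += dx * dx
        txy += dx * dy
        tyy += dy * dy
    # Closed form of lambda_1 - lambda_2 for a symmetric 2 x 2 matrix.
    return np.sqrt((txx - tyy) ** 2 + 4.0 * txy ** 2)
```

On a flat image the map is zero everywhere, while pixels next to a step edge receive contributions from every scale and therefore score highest.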