Spatio-Temporal Phrases for Activity Recognition

Volume Local Binary Patterns are a well-known feature type for describing object characteristics in the spatio-temporal domain. Apart from the computation of a binary pattern, further steps are required to create a discriminative feature. In this paper we propose different computation methods for Volume Local Binary Patterns. These methods are evaluated in detail and the best strategy is identified. A Random Forest is used to find discriminative patterns. The proposed methods are applied to the well-known and publicly available KTH and Weizmann datasets for single-view action recognition and to the IXMAS dataset for multi-view action recognition. Furthermore, the proposed framework is compared to state-of-the-art methods.


“…Laptev et al. [19], original split: 91.80%; Zhang et al. [31], original split: 94.00%; Wang et al. [24], original split: 94.20%; Proposed method…”

Section: Methods (mentioning)


Computation strategies for volume local binary patterns applied to action recognition

Baumann, Ehlers, Rosenhahn, et al. 2014

2014 11th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)


“…To compensate for the loss of structure in local representations, many methods try to improve local representations by exploring spatio-temporal structural information [33], including context information of each interest point [34,35], relationships among spatio-temporal interest points [36,37,38,39] and neighborhood-based features [40]. The relationships among visual words in the BoW model and their semantic meaning have also been explored to encode higher-level features [15,41,42,43]. New local descriptors have also been developed [44,45] to improve the performance of local methods.…”

Section: The BoW Model (mentioning)


“…Aiming to encode rich temporal ordering and spatial geometry information of local visual words, Zhang et al. [41] proposed to model the mutual relationships among visual words by a novel concept named the spatio-temporal phrase (ST phrase). An ST phrase is defined as a combination of k words in a certain spatial and temporal structure, including their order and relative positions.…”

Section: The BoW Model (mentioning)
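To make the definition quoted above concrete, the following is a minimal sketch of how a phrase of k visual words with their order and relative positions might be encoded. It is an illustration only, not the construction used by Zhang et al.: the `Word` type, the `st_phrases` function, and the choice to reduce relative positions to the sign of each offset are all assumptions introduced here.

```python
from itertools import combinations
from typing import NamedTuple


class Word(NamedTuple):
    """A detected local feature (hypothetical type for this sketch)."""
    label: int  # index of the visual word in a codebook
    x: float    # spatial position
    y: float
    t: float    # temporal position (e.g. frame index)


def sign(v: float) -> int:
    """Return -1, 0 or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def st_phrases(words, k=2):
    """Enumerate all k-word phrases as (labels, layout) pairs.

    Words are first put in temporal order. The layout records, for every
    word after the first, the sign of its (x, y, t) offset from the first
    word. This coarse quantisation is an assumption of the sketch.
    """
    ordered = sorted(words, key=lambda w: (w.t, w.x, w.y, w.label))
    phrases = []
    for combo in combinations(ordered, k):
        anchor = combo[0]
        labels = tuple(w.label for w in combo)
        layout = tuple(
            (sign(w.x - anchor.x), sign(w.y - anchor.y), sign(w.t - anchor.t))
            for w in combo[1:]
        )
        phrases.append((labels, layout))
    return phrases


example = [Word(7, 1.0, -1.0, 2.0), Word(3, 0.0, 0.0, 0.0)]
phrases = st_phrases(example, k=2)
```

Two occurrences of the same word pair then map to the same phrase only when their temporal order and coarse spatial arrangement agree, which is the kind of structure a plain word histogram discards.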


Action recognition via spatio-temporal local features: A comprehensive study

Zhen, Shao, 2016

Image and Vision Computing


Local methods based on spatio-temporal interest points (STIPs) have shown their effectiveness for human action recognition. The bag-of-words (BoW) model has been widely used and has dominated this field. Recently, a large number of techniques based on local features have been proposed and developed for visual recognition, including improved variants of the BoW model, sparse coding (SC), Fisher kernels (FK), vectors of locally aggregated descriptors (VLAD) and the naive Bayes nearest neighbor (NBNN) classifier. However, some of them were proposed in the image domain and have not yet been applied to the video domain, so it remains unclear how effectively these techniques would perform on action recognition. In this paper, we provide a comprehensive study of these local methods for human action recognition. We implement these techniques and compare them under unified experimental settings on three widely used benchmarks, i.e., the KTH, UCF-YouTube and HMDB51 datasets. We discuss the findings from the experimental results and draw conclusions that are expected to guide practical applications and future work in the action recognition community.
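As a rough illustration of the bag-of-words (BoW) representation named in the abstract above, the sketch below assigns each local descriptor to its nearest codeword and returns a normalised histogram. It is a generic textbook form of BoW, not the implementation evaluated in the study; the function name `bow_histogram`, the squared Euclidean distance, and the toy two-word codebook are assumptions made for the example.

```python
def bow_histogram(descriptors, codebook):
    """Quantise descriptors against a codebook and return a normalised histogram.

    descriptors: list of equal-length numeric sequences (local features).
    codebook:    list of codewords of the same length (e.g. cluster centres).
    """
    counts = [0] * len(codebook)
    for d in descriptors:
        # Squared Euclidean distance from this descriptor to every codeword.
        dists = [sum((a - b) ** 2 for a, b in zip(d, c)) for c in codebook]
        counts[dists.index(min(dists))] += 1
    total = sum(counts)
    # An empty video (no descriptors) yields an all-zero histogram.
    return [c / total for c in counts] if total else [0.0] * len(codebook)


toy_codebook = [[0.0, 0.0], [10.0, 10.0]]
toy_descriptors = [[1.0, 0.0], [9.0, 9.0], [0.0, 1.0], [11.0, 10.0]]
hist = bow_histogram(toy_descriptors, toy_codebook)
```

The histogram keeps only how often each word occurs, so all information about where and when the words appeared is lost.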


“…Although approaches based on image descriptors have the advantage of not requiring skeleton or object tracks to describe the observed activity, they cannot take into account the spatio-temporal relations between the relevant entities in the scene, which are important elements when learning and recognising human activities [25,17]. To address this issue, the concept of a "spatio-temporal phrase" was introduced, defined as a combination of local words in a certain spatial and temporal structure, including their order and relative positions [26]. This approach is very similar to the graph representations described before [12][13][14]; however, the spatio-temporal phrase still does not include qualitative spatial relations, and its temporal relations are far fewer than those of Allen's Interval Algebra used in the graph-based methods.…”

Section: Related Work (mentioning)
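For readers unfamiliar with the Allen's Interval Algebra mentioned in the statement above, it distinguishes 13 qualitative relations between two time intervals. The sketch below classifies a pair of intervals into one of them; the function name `allen_relation` and the representation of an interval as a `(start, end)` tuple are assumptions of this example, not taken from the cited work.

```python
def allen_relation(a, b):
    """Return the Allen relation of interval a with respect to interval b.

    Each interval is a (start, end) tuple with start < end.
    """
    (a0, a1), (b0, b1) = a, b
    if a0 >= a1 or b0 >= b1:
        raise ValueError("intervals must satisfy start < end")
    # Disjoint or touching intervals.
    if a1 < b0:
        return "before"
    if b1 < a0:
        return "after"
    if a1 == b0:
        return "meets"
    if b1 == a0:
        return "met-by"
    # Intervals sharing one or both endpoints.
    if a0 == b0 and a1 == b1:
        return "equals"
    if a0 == b0:
        return "starts" if a1 < b1 else "started-by"
    if a1 == b1:
        return "finishes" if a0 > b0 else "finished-by"
    # Strict containment.
    if b0 < a0 and a1 < b1:
        return "during"
    if a0 < b0 and b1 < a1:
        return "contains"
    # Only partial overlap remains.
    return "overlaps" if a0 < b0 else "overlapped-by"


relation = allen_relation((0, 2), (1, 3))
```

A representation that records only the order of events keeps a small subset of these distinctions, which is the limitation the statement attributes to the spatio-temporal phrase.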


Qualitative and Quantitative Spatio-temporal Relations in Daily Living Activity Recognition

Tayyub, Tavanai, Gatsoulis, et al. 2015

Computer Vision -- ACCV 2014


For intelligent assistive systems to operate effectively in real-world human environments, they must be able to recognise human activities and intentions. In this paper we propose a novel approach to activity recognition from visual data. Our approach is based on qualitative and quantitative spatio-temporal features which encode the interactions between human subjects and objects in an abstract and efficient manner. Unlike current state-of-the-art approaches, our approach makes significantly fewer assumptions and does not require any knowledge about object types, their affordances, or the sub-level activities that high-level activities consist of. We perform an automatic feature selection process which provides the most representative descriptions of the learnt activities. We validated our method using these descriptions on the CAD-120 benchmark dataset, which consists of video sequences showing humans performing daily real-world activities. The experimental results show the strength of our work, which significantly outperforms the current state of the art on this benchmark.



Cited by 91 publications

References 22 publications
