2012 19th IEEE International Conference on Image Processing
DOI: 10.1109/icip.2012.6466966

Dominant spatio-temporal modulations and energy tracking in videos: Application to interest point detection for action recognition

Abstract: The presence of multiband amplitude and frequency modulations (AM-FM) in wideband signals, such as textured images or speech, has led to the development of efficient multicomponent modulation models for low-level image and sound analysis. Moreover, compact yet descriptive representations have emerged by tracking, through non-linear energy operators, the dominant model components across time, space or frequency. In this paper, we propose a generalization of such approaches in the 3D spatio-temporal domain and e…
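The non-linear energy operators referred to above are of the Teager-Kaiser type, Ψ[s(t)] ≡ [ṡ(t)]² − s(t)s̈(t), as quoted in one of the citation contexts below. A minimal discrete-time sketch in Python, using the standard three-sample approximation x[n]² − x[n−1]·x[n+1] (the function name and the AM-FM test signal are illustrative assumptions, not code from the paper):

import numpy as np

def teager_kaiser_energy(x):
    # Discrete Teager-Kaiser energy operator:
    # Psi[x[n]] = x[n]^2 - x[n-1] * x[n+1]
    x = np.asarray(x, dtype=float)
    psi = np.zeros_like(x)
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    return psi

# For an AM-FM signal a(t) * cos(phi(t)), the output approximately tracks
# a(t)^2 times the squared instantaneous frequency (in radians per sample).
t = np.linspace(0.0, 1.0, 1000)
x = (1.0 + 0.5 * np.sin(2 * np.pi * 3 * t)) * np.cos(2 * np.pi * (50 * t + 20 * t ** 2))
energy = teager_kaiser_energy(x)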

Cited by 5 publications (6 citation statements)
References 16 publications
“…1 Dominant component analysis on the outputs of Gabor filterbanks has been used for 2D texture analysis and segmentation in [91,90,92] and for spatio-temporal action classification in [93,94]. It may include additional steps of demodulation.…”
Section: Postprocessing
confidence: 99%
“…1, the energy outputs of all 400 filters are handled by some operator in order to obtain the final energy map of each video. We used some ideas from Dominant Energy Analysis (DEA), as in [12], where the energy of the most dominant channel is considered as the energy value in each voxel:…”
Section: Ψ[s(t)] ≡ [ṡ(t)]² − s(t)s̈(t)
confidence: 99%
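A rough sketch of the per-voxel dominant-channel selection described in the statement above, assuming the per-channel 3D energy volumes (e.g., Teager-type energies of Gabor filterbank outputs) have already been computed; the array layout and function name are assumptions for illustration, not the cited implementation:

import numpy as np

def dominant_energy_analysis(channel_energies):
    # channel_energies: shape (K, T, H, W) -- one energy volume per filter channel.
    e = np.asarray(channel_energies)
    dominant_energy = e.max(axis=0)       # energy of the most dominant channel at each voxel
    dominant_channel = e.argmax(axis=0)   # index of that channel (useful for later demodulation)
    return dominant_energy, dominant_channel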
“…Video representations in terms of such features exhibit efficiency in distinguishing among action classes, while bypassing the need for precise background subtraction or tracking. Local image and video features have been successfully used for many tasks such as object and scene recognition [23] as well as human action recognition [9,12,24,32]. Local spatio-temporal features are able to capture characteristic shape and motion in video.…”
Section: Introduction
confidence: 99%
“…Traditionally, research in behavior and affect analysis has focused on recognizing behavioral cues such as smiles, head nods, and laughter (Déniz et al. 2008; Kawato and Ohya 2000; Lockerd and Mueller 2002), pre-defined posed human actions (e.g., walking, running, and hand-clapping) (Dollár et al. 2005; Niebles et al. 2008; Georgakis et al. 2012) or discrete, basic emotional states (e.g., happiness, sadness) (Pantic and Rothkrantz 2000; Cohen et al. 2003; Littlewort et al. 2006), mainly from posed data acquired in laboratory settings. However, these models are deemed unrealistic as they are unable to capture the temporal evolution of non-basic, possibly atypical, behaviors and subtle affective states exhibited by humans in naturalistic settings.…”
Section: Introduction
confidence: 99%