Although methods based on spatio-temporal interest points have shown promising results for human action recognition, they are not robust in complex scenes, especially those with background clutter, camera motion, occlusions, and illumination variations. In this paper, we propose a novel method to classify human actions in complex scenes. We suppress falsely detected interest points by detecting salient regions. Furthermore, we encode the features according to their spatio-temporal relationships. Our method is evaluated on two challenging databases (UCF Sports and YouTube), and the experimental results demonstrate that it achieves better results than previous methods in human action recognition.
Keywords Human action recognition • Salient region detection • Complex scenes
Introduction
Automatically recognizing human actions is receiving increasing attention due to its wide range of applications, such as video retrieval, human-computer interaction, and activity monitoring. A large number of methods [1,2] for human action recognition have been proposed, ranging from trajectory-based methods [3] and local descriptor-based methods [4] to attribute-based methods [5,6].