This paper proposes a novel approach for visually anonymizing video clips while retaining the ability to perform machine-based analysis of those clips, such as human action recognition. Visual anonymization is achieved by a novel method that generates the anonymization silhouette by modelling frame-wise temporal visual salience. These temporal-salience-based silhouettes are then analysed by extracting the proposed histograms of gradients in salience (HOG-S) to learn the action representation in the visually anonymized domain. Since the anonymization maps are derived from grayscale temporal salience maps, only the moving body parts involved in the action are assigned large gray values, forming highly anonymized silhouettes. This yields the highest mean anonymity score (MAS), the fewest identifiable visual appearance attributes and high human-perceived utility in action recognition. In terms of machine-based human action recognition, the proposed HOG-S features achieve the highest accuracy in the anonymized domain compared to existing anonymization methods. Overall, the proposed holistic human action recognition method, i.e., temporal salience modelling followed by HOG-S feature extraction, achieves the best action recognition accuracy on the DHA, KTH, UIUC1, UCF Sports and HMDB51 datasets, with improvements of 3%, 1.6%, 0.8%, 1.3% and 16.7%, respectively, outperforming both feature-based and deep learning-based existing approaches.

INDEX TERMS Visual anonymization, human action recognition, histogram of gradients in salience (HOG-S), temporal visual salience estimation, privacy, video-based monitoring, assisted living.
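To make the HOG-S idea concrete, the following is a minimal NumPy sketch of a HOG-style descriptor computed directly on a grayscale temporal salience map. It is an illustration only and makes assumptions the abstract does not specify (a single global magnitude-weighted orientation histogram with 9 unsigned-orientation bins and L2 normalisation); the paper's actual binning and spatial block layout may differ.

```python
import numpy as np

def hog_s(salience_map, n_bins=9):
    """Illustrative histogram of gradients on a grayscale temporal
    salience map (bin count and normalisation are assumptions)."""
    s = salience_map.astype(np.float64)
    # Central-difference gradients of the salience values.
    gy, gx = np.gradient(s)
    mag = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees, as in classic HOG.
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0
    # Magnitude-weighted orientation histogram over the whole map.
    hist, _ = np.histogram(ang, bins=n_bins, range=(0.0, 180.0), weights=mag)
    # L2-normalise so the descriptor magnitude is comparable across frames.
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

# Toy salience map: a bright "moving region" blob on a dark background.
sal = np.zeros((32, 32))
sal[8:24, 8:24] = 200.0
desc = hog_s(sal)
print(desc.shape)  # (9,)
```

Because the salience map is near-zero for static background, the descriptor is dominated by gradients around the moving body parts, which is the intuition behind extracting HOG in the salience domain rather than on the raw appearance.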