Hai-Hong Phan scite author profile

In this paper, we present a novel descriptor for human action recognition, called Motion of Oriented Magnitudes Patterns (MOMP), which considers the relationships between the local gradient distributions of neighboring patches from successive video frames. The descriptor also characterizes how information changes across different orientations, and is therefore highly discriminative and robust. The major advantages of MOMP are its fast computation and simple implementation. Our features are then combined with an effective coding scheme, VLAD (Vector of Locally Aggregated Descriptors), in the feature representation step, and with an SVM (Support Vector Machine) classifier, in order to better represent and classify the actions. In experiments on several common benchmarks, we obtain state-of-the-art results on the KTH dataset and performance comparable to the literature on the UCF Sport dataset.
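As a rough illustration of the VLAD step named in this abstract, the sketch below aggregates a set of local descriptors into one fixed-length vector. It is a minimal generic version, not the authors' implementation: the function name `vlad_encode`, the plain L2 normalization, and the assumption that the codebook centroids were learned beforehand (for example by k-means) are all choices made here for illustration.

```python
import math

def vlad_encode(descriptors, centers):
    """Aggregate local descriptors into a single VLAD vector.

    descriptors: list of equal-length float lists (local features of one video).
    centers: list of codebook centroids, assumed to be learned beforehand.
    """
    dim = len(centers[0])
    residuals = [[0.0] * dim for _ in centers]
    for d in descriptors:
        # Assign the descriptor to its nearest centroid (squared Euclidean distance).
        nearest = min(
            range(len(centers)),
            key=lambda k: sum((x - c) ** 2 for x, c in zip(d, centers[k])),
        )
        # Accumulate the residual between the descriptor and that centroid.
        for j in range(dim):
            residuals[nearest][j] += d[j] - centers[nearest][j]
    # Concatenate the per-centroid residual sums and L2-normalize the result.
    flat = [v for row in residuals for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat
```

The resulting vector has length (number of centroids) x (descriptor dimension) and could then be passed to any classifier, such as the SVM mentioned in the abstract.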

Information theory based pruning for CNN compression and its application to image classification and action recognition

2019

KFSENet: A Key Frame-Based Skeleton Feature Estimation and Action Recognition Network for Improved Robot Vision with Face and Emotion Recognition

et al. 2022

Applied Sciences

In this paper, we propose an integrated approach to robot vision: a key frame-based skeleton feature estimation and action recognition network (KFSENet) that incorporates action recognition with face and emotion recognition to enable social robots to engage in more personal interactions. Instead of extracting the human skeleton features from the entire video, we propose a key frame-based approach for their extraction using pose estimation models. We select the key frames using the gradient of a proposed total motion metric that is computed using dense optical flow. We use the extracted human skeleton features from the selected key frames to train a deep neural network (i.e., the double-feature double-motion network (DDNet)) for action recognition. The proposed KFSENet utilizes a simpler model to learn and differentiate between the different action classes, is computationally simpler and yields better action recognition performance when compared with existing methods. The use of key frames allows the proposed method to eliminate unnecessary and redundant information, which improves its classification accuracy and decreases its computational cost. The proposed method is tested on both publicly available standard benchmark datasets and self-collected datasets. The performance of the proposed method is compared to existing state-of-the-art methods. Our results indicate that the proposed method yields better performance compared with existing methods. Moreover, our proposed framework integrates face and emotion recognition to enable social robots to engage in more personal interaction with humans.
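The key frame idea in this abstract can be sketched as follows. The abstract does not give the exact metric or selection rule, so everything concrete here is an assumption: `total_motion` sums optical-flow magnitudes for one frame (the flow itself is assumed to be precomputed by some dense optical flow method), and `select_key_frames` keeps frames where the finite-difference gradient of that signal exceeds a hypothetical `threshold`.

```python
import math

def total_motion(flow):
    """Sum of optical-flow vector magnitudes for one frame.

    flow: list of (dx, dy) tuples, one per pixel, assumed precomputed.
    """
    return sum(math.hypot(dx, dy) for dx, dy in flow)

def select_key_frames(motion, threshold):
    """Return frame indices whose motion gradient magnitude exceeds threshold.

    motion: list of per-frame total motion values.
    threshold: hypothetical cut-off; the paper's actual rule is not stated here.
    """
    last = len(motion) - 1
    keys = []
    for t in range(len(motion)):
        lo, hi = max(t - 1, 0), min(t + 1, last)
        # Central difference inside the sequence, one-sided at the ends.
        grad = (motion[hi] - motion[lo]) / (hi - lo) if hi > lo else 0.0
        if abs(grad) > threshold:
            keys.append(t)
    return keys
```

Only the selected frames would then be passed to the pose estimation model, which is where the reduction in computation described in the abstract would come from.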

Action recognition based on motion of oriented magnitude patterns and feature selection

Nguyen

et al. 2018

IET Computer Vision

Here, the authors introduce a novel system which incorporates the discriminative motion of oriented magnitude patterns (MOMP) descriptor into simple yet efficient techniques. The authors' descriptor both investigates the relations between the local gradient distributions of neighbouring patches in consecutive frames and characterises how information changes across different orientations. The proposed system has two main contributions: (i) the authors post-process the features with principal component analysis, followed by vector of locally aggregated descriptors encoding, to de-correlate the MOMP descriptor and reduce its dimension in order to speed up the algorithm; (ii) the authors then apply feature selection (i.e. statistical dependency, mutual information, and minimal redundancy maximal relevance) to find the best feature subset, improving performance and decreasing the computational expense of classification with support vector machines. Experimental results on four data sets, Weizmann (98.4%), KTH (96.3%), UCF Sport (82.0%), and HMDB51 (31.5%), demonstrate the efficiency of the authors' algorithm.
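Of the feature selection criteria listed in this abstract, mutual information is the simplest to illustrate. The sketch below ranks features by their mutual information with the class label. It assumes the feature values have already been discretized, and the names `mutual_information` and `rank_features` are illustrative only, not taken from the paper.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum p(x, y) * log2(p(x, y) / (p(x) * p(y))) over observed pairs.
    return sum(
        (c / n) * math.log2((c * n) / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )

def rank_features(columns, labels, top_k):
    """Return indices of the top_k features most informative about labels.

    columns: one list of discretized values per feature.
    """
    scores = [(mutual_information(col, labels), i) for i, col in enumerate(columns)]
    # Highest score first; ties are broken by the lower feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scores[:top_k]]
```

Ranking by relevance alone ignores redundancy between the selected features, which is the gap that the minimal redundancy maximal relevance criterion mentioned in the abstract is designed to address.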

LBP-and-ScatNet-based combined features for efficient texture classification

Nguyen

et al. 2017

Multimed Tools Appl

Multiple Imputation by Generative Adversarial Networks for Classification with Incomplete Data

Ngoc

Nguyen

Tran

et al. 2021

Building a technological solution for recognizing hydrometeorological charts (Xây dựng giải pháp công nghệ nhận dạng giản đồ Khí tượng thủy văn)

Phương

Hưng

Huy

et al. 2021

VNJHM

Hydrometeorological charts record measurements of rainfall, water level, humidity, temperature, and other parameters collected from hydrometeorological measurement stations nationwide. Storing this information is extremely important for researching and forecasting weather and natural disasters. At present, however, all chart types are stored on paper, and reading the data depends on experts, so it is difficult to guarantee the integrity of the data over time. In this paper, we propose a solution that recognizes the charts and automatically records their information using advanced machine vision and artificial intelligence technologies, helping to store and digitize the data and diagrams automatically. The solution integrates a page structure analysis algorithm, a grid detection algorithm, and an alignment algorithm, combined with detection of lines and objects in the chart, to separate the plotted line. In experiments, the solution achieved high accuracy: more than 90% of the diagrams can be digitized, including all types of diagrams of precipitation, water level, humidity, pressure, and temperature.

Research on applying a table structure recognition algorithm based on object detection (Nghiên cứu ứng dụng thuật toán nhận dạng cấu trúc bảng dựa trên phát hiện đối tượng)

Dương

Phan

Phương

2021

VNJHM