In this paper, we present a novel descriptor for human action recognition, called Motion of Oriented Magnitudes Patterns (MOMP), which considers the relationships between the local gradient distributions of neighboring patches taken from successive video frames. The proposed descriptor also characterizes how information changes across different orientations and is therefore highly discriminative and robust. The major advantages of MOMP are its very fast computation and simple implementation. Our features are then combined with an effective coding scheme, VLAD (Vector of Locally Aggregated Descriptors), in the feature representation step and with an SVM (Support Vector Machine) classifier to better represent and classify the actions. In experiments on several common benchmarks, we obtain state-of-the-art results on the KTH dataset and performance comparable to the literature on the UCF Sport dataset.
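To illustrate the VLAD step mentioned in the abstract above, here is a minimal sketch that aggregates local descriptors against a fixed codebook. The function name `vlad_encode`, the toy codebook, and the final L2 normalization are assumptions for illustration only; the abstract does not state the codebook size or the normalization used.

```python
import math

def vlad_encode(descriptors, codebook):
    """Aggregate local descriptors into one VLAD vector.

    descriptors: list of equal-length float lists (local features).
    codebook: list of cluster centers (e.g. from k-means), same dimension.
    """
    dim = len(codebook[0])
    # One residual accumulator per codebook center.
    residuals = [[0.0] * dim for _ in codebook]
    for d in descriptors:
        # Assign the descriptor to its nearest center (squared Euclidean distance).
        nearest = min(
            range(len(codebook)),
            key=lambda k: sum((x - c) ** 2 for x, c in zip(d, codebook[k])),
        )
        # Accumulate the residual between the descriptor and its center.
        for i in range(dim):
            residuals[nearest][i] += d[i] - codebook[nearest][i]
    # Concatenate the per-center residuals into a single vector.
    flat = [v for r in residuals for v in r]
    # L2-normalize (an assumed, commonly used post-processing step).
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat
```

The resulting vector has length `len(codebook) * dim` and could then be passed to a classifier such as an SVM.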
In this paper, we propose an integrated approach to robot vision: a key frame-based skeleton feature estimation and action recognition network (KFSENet) that combines action recognition with face and emotion recognition to enable social robots to engage in more personal interactions. Instead of extracting human skeleton features from the entire video, we extract them only from key frames using pose estimation models. We select the key frames using the gradient of a proposed total motion metric that is computed from dense optical flow. We use the skeleton features extracted from the selected key frames to train a deep neural network, the double-feature double-motion network (DDNet), for action recognition. The proposed KFSENet uses a simpler model to learn and differentiate between action classes, is computationally simpler, and yields better action recognition performance than existing methods. Using key frames eliminates unnecessary and redundant information, which improves classification accuracy and decreases computational cost. The proposed method is tested on both publicly available standard benchmark datasets and self-collected datasets and is compared with existing state-of-the-art methods; our results indicate that it yields better performance. Moreover, the proposed framework integrates face and emotion recognition to enable social robots to engage in more personal interaction with humans.
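The key frame selection described in the abstract above can be sketched roughly as follows. The abstract does not define the total motion metric or the selection rule, so both are assumptions here: total motion is taken as the sum of flow-vector magnitudes, and a frame is kept when the finite-difference gradient of that metric changes sign. The flow fields are assumed to be precomputed by some dense optical flow method, and the function names are hypothetical.

```python
import math

def total_motion(flow):
    """Sum of flow-vector magnitudes for one frame pair.

    flow: iterable of (dx, dy) displacement vectors, one per pixel.
    This definition is an assumption; the paper's metric may differ.
    """
    return sum(math.hypot(dx, dy) for dx, dy in flow)

def select_key_frames(flows):
    """Return indices where the temporal gradient of total motion changes sign.

    Assumed rule: a strict sign change in the finite-difference gradient
    marks a local extremum of motion, which is kept as a key frame.
    """
    motion = [total_motion(f) for f in flows]
    # Forward finite difference approximates the temporal gradient.
    grad = [motion[t + 1] - motion[t] for t in range(len(motion) - 1)]
    keys = []
    for t in range(1, len(grad)):
        if grad[t - 1] * grad[t] < 0:  # strict sign change around index t
            keys.append(t)
    return keys
```

Only the frames at the returned indices would then be passed to the pose estimation model, which is where the computational saving claimed in the abstract would come from.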
Here, the authors introduce a novel system that incorporates the discriminative motion of oriented magnitude patterns (MOMP) descriptor into simple yet efficient techniques. The descriptor both examines the relations between the local gradient distributions of neighbours across consecutive image sequences and characterises how information changes across different orientations. The proposed system makes two main contributions: (i) the authors apply principal component analysis as feature post-processing, followed by vector of locally aggregated descriptors encoding, to de-correlate the MOMP descriptor and reduce its dimension in order to speed up the algorithm; (ii) the authors then apply feature selection (i.e. statistical dependency, mutual information, and minimal redundancy maximal relevance) to find the best feature subset, improving performance and decreasing the computational expense of classification with support vector machines. Experimental results on four data sets, Weizmann (98.4%), KTH (96.3%), UCF Sport (82.0%), and HMDB51 (31.5%), demonstrate the efficiency of the authors' algorithm.
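Of the feature selection criteria listed in (ii) above, mutual information is the simplest to illustrate. The sketch below ranks already-discretized features by their mutual information with the class labels. The discretization, the function names, and the toy data are assumptions for illustration; the abstract does not describe how the authors estimate these quantities.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of feature values
    py = Counter(ys)            # marginal counts of labels
    pxy = Counter(zip(xs, ys))  # joint counts
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with counts.
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi

def rank_features(columns, labels):
    """Return feature indices sorted by decreasing MI with the labels."""
    scores = [mutual_information(col, labels) for col in columns]
    return sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)

# Toy usage: feature 0 mirrors the labels, feature 1 is independent of them.
labels = [0, 0, 1, 1]
columns = [[0, 0, 1, 1], [0, 1, 0, 1]]
ranking = rank_features(columns, labels)
```

Keeping only the top-ranked indices gives a smaller feature subset for the classifier; minimal redundancy maximal relevance extends this idea by also penalizing mutual information between the selected features.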
Meteorological and hydrological charts record measurement data such as rainfall, water level, humidity, temperature, and other measured parameters. These parameters are collected from hydrometeorological measurement stations nationwide. Storing this information is extremely important for researching and forecasting weather and natural disasters. However, at present all types of charts are stored in paper form, and reading the data depends on experts. It is therefore difficult to guarantee the integrity of the data over time. In this paper, we propose a solution for chart recognition and automatic recording of chart information using advanced machine vision and artificial intelligence technologies to help store and digitize the data and charts automatically. The solution integrates a page structure analysis algorithm, a grid detection algorithm, and an alignment algorithm, and combines them with detection of the lines and objects in the chart to separate the line. In experiments, the solution achieved high accuracy: more than 90% of the charts can be digitized, including all chart types for precipitation, water level, humidity, pressure, and temperature.