Action recognition in robotics has gained momentum in recent years. This work presents a video activity recognition method whose ultimate goal is to endow a robot with action recognition capabilities for more natural social interaction. Common Spatial Patterns (CSP), a signal processing approach widely used in electroencephalography (EEG), is applied in a novel way to activity recognition in videos taken by a humanoid robot. A sequence of skeleton data is treated as a multidimensional signal and filtered according to the CSP algorithm. Characteristics extracted from the filtered data are then used as features for a classifier. A database of 46 individuals performing six different actions has been created to test the proposed method. The CSP-based method, combined with a Linear Discriminant Analysis (LDA) classifier, has been compared to a Long Short-Term Memory (LSTM) neural network; the former obtains similar or better results than the latter while being simpler.
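The CSP filtering and feature extraction steps described above can be sketched as follows. This is a minimal two-class illustration, not the authors' implementation: the function names `csp_filters` and `csp_features`, the array layout (trials × channels × frames, with each channel standing for one skeleton-joint coordinate), the trace normalization, and the log-variance features are all assumptions borrowed from common EEG practice. The resulting feature vectors would then be passed to a classifier such as LDA.

```python
import numpy as np


def csp_filters(trials_a, trials_b, n_pairs=2):
    """Compute two-class CSP spatial filters (hypothetical helper).

    trials_a, trials_b: arrays of shape (n_trials, n_channels, n_samples).
    Returns an array of shape (2 * n_pairs, n_channels); each row is a filter.
    """
    def mean_cov(trials):
        covs = []
        for x in trials:
            x = x - x.mean(axis=1, keepdims=True)  # remove per-channel mean
            c = x @ x.T
            covs.append(c / np.trace(c))           # trace-normalized covariance
        return np.mean(covs, axis=0)

    cov_a, cov_b = mean_cov(trials_a), mean_cov(trials_b)

    # Whiten the composite covariance so that P (Ca + Cb) P^T = I.
    evals, evecs = np.linalg.eigh(cov_a + cov_b)
    whitener = np.diag(evals ** -0.5) @ evecs.T

    # Eigenvectors of the whitened class-A covariance; eigh sorts eigenvalues
    # in ascending order, so the first rows favor class B and the last class A.
    _, u = np.linalg.eigh(whitener @ cov_a @ whitener.T)
    filters = u.T @ whitener

    # Keep the n_pairs filters from each end of the spectrum.
    idx = np.r_[np.arange(n_pairs), np.arange(-n_pairs, 0)]
    return filters[idx]


def csp_features(trials, filters):
    """Log of the normalized variance of each CSP-filtered signal."""
    feats = []
    for x in trials:
        z = filters @ (x - x.mean(axis=1, keepdims=True))
        var = z.var(axis=1)
        feats.append(np.log(var / var.sum()))
    return np.array(feats)


# Synthetic stand-in for skeleton sequences: 6 channels, 200 frames per trial.
rng = np.random.default_rng(0)
class_a = rng.standard_normal((20, 6, 200))
class_b = rng.standard_normal((20, 6, 200))
class_a[:, 0, :] *= 5.0  # class A moves mostly along channel 0
class_b[:, 1, :] *= 5.0  # class B moves mostly along channel 1

W = csp_filters(class_a, class_b, n_pairs=2)
features_a = csp_features(class_a, W)  # shape (20, 4)
```

With more than two actions, as in the six-action database, a two-class formulation like this one would typically be applied pairwise or one-versus-rest; which scheme the authors use is not stated here.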
In this paper we report on the design of a pipeline that combines Common Spatial Patterns (CSP), a signal processing approach commonly used in electroencephalography (EEG), with a matrix representation of features and image classification to categorize videos taken by a humanoid robot. The ultimate goal is to endow the robot with action recognition capabilities for more natural social interaction. In summary, we apply the CSP algorithm to a set of signals obtained for each video by extracting the skeleton joints of the person performing the action. From the transformed signals a summary image is obtained for each video, and these images are then classified using two different approaches: global visual descriptors and convolutional neural networks. The approach has been tested on two data sets that represent two scenarios with common characteristics. The first is a data set of 46 individuals performing 6 different actions, for which OpenPose has been used to extract the skeleton joints that form the group of signals of each video. The second is an Argentinian Sign Language data set (LSA64), from which only the signs performed with just the right hand have been used; in this case the joint signals have been obtained using MediaPipe. The results have been compared with those of a Long Short-Term Memory (LSTM) method, with promising results.
Music genre classification is a challenging research problem, for which open questions remain regarding the classification approach, the representation of music pieces, the distances between and within genres, and so on. This paper investigates the classification of generated music pieces. The underlying idea is that, if closely related known pieces are grouped into different sets (or clusters) and a new song somehow “inspired” by each set is then generated automatically, the new song will be more likely to be classified as belonging to the set that inspired it, under the same distance used to separate the clusters. Different music piece representations and distances among pieces are used; the results obtained are promising and indicate the appropriateness of the approach, even in an area as subjective as music genre classification.
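The classification step described above, assigning a generated piece to the cluster it is closest to, can be sketched as follows. This is only an illustration under stated assumptions: pieces are represented as plain numeric feature vectors, the distance is Euclidean, and a cluster's closeness is the mean distance to its members. The abstract does not specify any of these, and the names `euclidean` and `nearest_cluster` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_cluster(piece, clusters, distance=euclidean):
    """Return the label of the cluster whose members are closest on average.

    clusters: dict mapping a label to a non-empty list of feature vectors.
    distance: any callable taking two pieces; Euclidean is an assumed default.
    """
    def mean_distance(members):
        return sum(distance(piece, m) for m in members) / len(members)

    return min(clusters, key=lambda label: mean_distance(clusters[label]))


# Toy example with two-dimensional, made-up feature vectors.
clusters = {
    "set_1": [(0.0, 0.0), (0.0, 1.0)],
    "set_2": [(5.0, 5.0), (6.0, 5.0)],
}
generated_piece = (0.2, 0.4)  # imagined to be "inspired" by set_1
label = nearest_cluster(generated_piece, clusters)
```

Passing a different `distance` callable is how the same check could be repeated for each of the representations and distances under comparison.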
This paper presents a music generation method that extends a previously presented method for generating coherent melodies from a melodic coherence structure extracted from a template piece. The extension, which has been applied to the generation of bertso melodies, adds the generation of the rhythmic content of the melodies, for which a rhythmic coherence structure of the template piece is also created. To do so, a pattern discovery and ranking method is used to find the interesting rhythmically repeated segments and to build a rhythmic coherence structure, which can have several levels of nesting. Independent sampling processes have been developed for melodic and rhythmic content, with an adapted optimization method used to sample the rhythmic content of the new pieces. Some of the generated pieces have been evaluated, considering on the one hand how listeners perceive them and on the other hand whether they share features with bertso melodies. The evaluation concludes that the method is capable of generating good, coherent bertso melodies.
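To give a feel for what discovering rhythmically repeated segments involves, the sketch below uses naive n-gram counting over a sequence of note durations. It is not the pattern discovery and ranking method used in the paper, which is not detailed here: the function name `repeated_rhythm_segments`, the duration encoding, and the length bounds are assumptions, and no ranking or nesting is attempted.

```python
from collections import defaultdict


def repeated_rhythm_segments(durations, min_len=2, max_len=4):
    """Find duration subsequences that occur at least twice (naive sketch).

    durations: sequence of note durations, e.g. in units of a sixteenth note.
    Returns a dict mapping each repeated segment (as a tuple) to the list of
    start positions where it occurs. Overlapping occurrences are included.
    """
    positions = defaultdict(list)
    n = len(durations)
    for length in range(min_len, max_len + 1):
        for start in range(n - length + 1):
            segment = tuple(durations[start:start + length])
            positions[segment].append(start)
    # Keep only the segments seen more than once.
    return {seg: starts for seg, starts in positions.items() if len(starts) >= 2}


# Made-up template rhythm: a short-short-long figure stated twice, then a long note.
template = [1, 1, 2, 1, 1, 2, 4]
patterns = repeated_rhythm_segments(template, min_len=3, max_len=3)
```

A coherence structure would then record where each selected segment repeats, so that newly sampled rhythms can be constrained to repeat at the same positions.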