This paper describes a novel approach for autonomous and incremental learning of motion pattern primitives by observation of human motion. Human motion patterns are abstracted into a dynamic stochastic model, which can be used for both subsequent motion recognition and generation, analogous to the mirror neuron hypothesis in primates. The model size is adaptable based on the discrimination requirements in the associated region of the current knowledge base. A new algorithm for sequentially training the Markov chains is developed to reduce the computation cost during model adaptation. As new motion patterns are observed, they are incrementally grouped together using hierarchical agglomerative clustering based on their relative distance in the model space. The clustering algorithm forms a tree structure, with specialized motions at the tree leaves and generalized motions closer to the root. The generated tree structure depends on the type of training data provided, so that the most specialized motions are those for which the most training has been received. Tests with motion capture data for a variety of motion primitives demonstrate the efficacy of the algorithm.
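The sequential training idea can be illustrated with a minimal sketch: transition counts of a Markov chain are accumulated one observation sequence at a time, so adapting the model never requires revisiting previously seen data. The class and its names are illustrative, not the paper's algorithm.

```python
# Minimal sketch of sequential (incremental) Markov chain training:
# transition counts are folded in per observed sequence, so model
# adaptation never retrains from scratch. Names are illustrative.

class IncrementalMarkovChain:
    def __init__(self, n_states):
        self.n = n_states
        # Start each count at 1 (Laplace smoothing) so unseen
        # transitions keep nonzero probability.
        self.counts = [[1.0] * n_states for _ in range(n_states)]

    def update(self, state_sequence):
        """Fold one new observation sequence into the counts."""
        for a, b in zip(state_sequence, state_sequence[1:]):
            self.counts[a][b] += 1.0

    def transition_matrix(self):
        """Row-normalized transition probabilities."""
        return [[c / sum(row) for c in row] for row in self.counts]

mc = IncrementalMarkovChain(n_states=2)
mc.update([0, 0, 1, 1, 0])   # first observed sequence
mc.update([0, 1, 1, 1])      # later sequence: only counts are touched
A = mc.transition_matrix()
```

Because only the counts are stored, the cost of incorporating a new sequence is linear in its length, independent of how much data the model has already absorbed.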
This paper describes a novel approach to linguistic mutual inference, which enables robots not only to interpret motion patterns linguistically, in the form of sentences, but also to generate motions from sentences. The inference is established through two modules: the motion language model and the natural language model. The motion language model stochastically represents the association between symbols of motion patterns and the words in the sentences assigned to those motions; it is a statistical model with a three-layered structure of motion symbols, latent states, and words. The natural language model statistically represents the structure of sentences based on word bigrams. The motion language model and the natural language model correspond to semantics and syntax, respectively. Integrating the motion language model with the natural language model enables linguistic mutual inference for robots. The two kinds of inference are posed as search problems: searching for the sequence of words corresponding to a motion, and searching for the symbol of the motion pattern corresponding to a sentence. The proposed approach to interpreting motion patterns as sentences and generating motion patterns from sentences through this integration is validated by an experiment on human behavioral data.
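The word-bigram component can be sketched concisely: bigram counts from a corpus yield smoothed word-pair probabilities, and a sentence is scored by the sum of their logarithms. The toy corpus, vocabulary size, and smoothing constant below are illustrative only.

```python
# Minimal sketch of a bigram natural language model: a sentence is
# scored by add-alpha smoothed word-pair probabilities estimated from
# a toy corpus. Corpus and parameters are illustrative stand-ins.

from collections import Counter
from math import log

corpus = [
    ["<s>", "the", "human", "walks", "</s>"],
    ["<s>", "the", "robot", "walks", "</s>"],
    ["<s>", "the", "human", "waves", "</s>"],
]

unigrams = Counter(w for s in corpus for w in s[:-1])
bigrams = Counter((a, b) for s in corpus for a, b in zip(s, s[1:]))

def log_prob(sentence, alpha=0.1, vocab=10):
    """Add-alpha smoothed bigram log-probability of a word sequence."""
    lp = 0.0
    for a, b in zip(sentence, sentence[1:]):
        lp += log((bigrams[(a, b)] + alpha) / (unigrams[a] + alpha * vocab))
    return lp

# A syntactically familiar sentence should outscore a scrambled one.
good = log_prob(["<s>", "the", "robot", "waves", "</s>"])
bad = log_prob(["<s>", "waves", "robot", "the", "</s>"])
```

In the full approach, such a syntactic score would be combined with the motion language model's semantic score when searching for the best word sequence for a motion.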
This paper describes a novel approach for incremental learning of human motion pattern primitives through on-line observation of human motion. The observed motion time series data stream is first stochastically segmented into potential motion primitive segments, based on the assumption that data belonging to the same motion primitive will have the same underlying distribution. The motion segments are then abstracted into a stochastic model representation, and automatically clustered and organized. As new motion patterns are observed, they are incrementally grouped together based on their relative distance in the model space. The resulting representation of the knowledge domain is a tree structure, with specialized motions at the tree leaves and generalized motions closer to the root. The tree leaves, which represent the most specialized learned motion primitives, are then passed back to the segmentation algorithm, so that as the number of known motion primitives increases, the accuracy of the segmentation can also be improved. The combined algorithm is tested on a sequence of continuous human motion data obtained through motion capture, and the results demonstrate the performance of the proposed approach.
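The segmentation assumption, that data from one primitive share one underlying distribution, can be illustrated with a deliberately simple stand-in for the paper's stochastic method: flag a boundary wherever the means of two adjacent windows diverge. The window size, threshold, and data are all illustrative.

```python
# Minimal stand-in (NOT the paper's stochastic segmentation): flag a
# segment boundary where the means of two adjacent sliding windows
# differ by more than a threshold, i.e. where the underlying
# distribution appears to change.

def segment_boundaries(stream, window=4, threshold=1.0):
    """Return indices where adjacent-window means diverge."""
    boundaries = []
    for i in range(window, len(stream) - window):
        left = sum(stream[i - window:i]) / window
        right = sum(stream[i:i + window]) / window
        if abs(right - left) > threshold:
            boundaries.append(i)
    return boundaries

# Two synthetic "primitives": low-valued samples followed by high ones,
# with the true distribution change at index 6.
stream = [0.0, 0.1, 0.0, 0.1, 0.0, 0.1, 5.0, 5.1, 5.0, 5.1, 5.0, 5.1]
cuts = segment_boundaries(stream)
```

In the full system, these candidate cuts would come from a stochastic model rather than a mean test, and the learned leaf primitives would feed back to refine them.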
This paper describes a novel algorithm for autonomous and incremental learning of motion pattern primitives by observation of human motion. Human motion patterns are abstracted into a Hidden Markov Model representation, which can be used for both subsequent motion recognition and generation, analogous to the mirror neuron hypothesis in primates. As new motion patterns are observed, they are incrementally grouped together using hierarchical agglomerative clustering based on their relative distance in the HMM space. The clustering algorithm forms a tree structure, with specialized motions at the tree leaves and generalized motions closer to the root. The generated tree structure depends on the type of training data provided, so that the most specialized motions are those for which the most training has been received. Tests with motion capture data for a variety of motion primitives demonstrate the efficacy of the algorithm.
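The clustering step can be sketched independently of the HMM distance itself: given a symmetric matrix of pairwise distances between learned models, single-linkage agglomerative merging builds the tree from the closest (most specialized) pairs upward. The distance matrix below is illustrative.

```python
# Minimal sketch of hierarchical agglomerative clustering over a
# symmetric distance matrix between learned models (the HMM distance
# itself is not reproduced; the matrix below is illustrative).

def agglomerate(dist):
    """Single-linkage merges; returns the merge order as a list."""
    clusters = {i: [i] for i in range(len(dist))}
    tree = []
    while len(clusters) > 1:
        # Find the closest pair of clusters (single linkage).
        a, b = min(
            ((a, b) for a in clusters for b in clusters if a < b),
            key=lambda p: min(dist[i][j] for i in clusters[p[0]]
                              for j in clusters[p[1]]),
        )
        tree.append((a, b))
        clusters[a] = clusters[a] + clusters.pop(b)
    return tree

# Three motions: models 0 and 1 are close in model space, model 2 is
# distant, so 0 and 1 merge first (a specialized leaf pair) and the
# generalized root joins all three last.
dist = [[0.0, 0.2, 5.0],
        [0.2, 0.0, 5.1],
        [5.0, 5.1, 0.0]]
merge_order = agglomerate(dist)
```

The merge order directly mirrors the abstract's tree: early merges are the specialized leaves, the final merge is the generalized root.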
In this paper, we propose a novel graph convolutional network architecture, Graph Stacked Hourglass Networks, for 2D-to-3D human pose estimation tasks. The proposed architecture consists of repeated encoder-decoder stages, in which graph-structured features are processed across three different scales of human skeletal representations. This multi-scale architecture enables the model to learn both local and global feature representations, which are critical for 3D human pose estimation. We also introduce a multi-level feature learning approach using different-depth intermediate features and show the performance improvements that result from exploiting multi-scale, multi-level feature representations. Extensive experiments are conducted to validate our approach, and the results show that our model outperforms the state-of-the-art.
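The multi-scale idea rests on pooling joint features into coarser skeletal scales and broadcasting them back, which can be sketched without the graph convolutions themselves. The joint grouping and feature values below are illustrative, not the paper's exact scales.

```python
# Minimal sketch of the skeletal pooling behind a multi-scale pose
# architecture: joint features are averaged into body-part features
# (a coarser scale), then broadcast back to the joint scale. The
# grouping and values are illustrative, not the paper's design.

def pool(features, groups):
    """Average the features of the joints in each group."""
    return [
        [sum(features[j][d] for j in g) / len(g)
         for d in range(len(features[0]))]
        for g in groups
    ]

def unpool(coarse, groups, n_joints):
    """Broadcast each group's feature back to its member joints."""
    fine = [None] * n_joints
    for gi, g in enumerate(groups):
        for j in g:
            fine[j] = list(coarse[gi])
    return fine

joint_feats = [[1.0], [3.0], [10.0], [20.0]]   # 4 joints, 1-D features
parts = [[0, 1], [2, 3]]                        # e.g. arm vs. leg joints
part_feats = pool(joint_feats, parts)           # coarser body-part scale
back = unpool(part_feats, parts, 4)             # back to joint scale
```

In an hourglass stage, graph convolutions would be applied at each scale between such pool and unpool steps, letting coarse (global) and fine (local) features interact.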
Communication skill is essential for social robots in various environments such as homes, offices, and hospitals, where the robots are expected to interact with humans. In this paper, we model the primitive nonverbal communication between two persons with a mimetic communication model. The model consists of three groups of Hidden Markov Models (HMMs) hierarchically combined to recognize the motions of the human and to generate the interactive motions of the robot. HMMs in the lower layer abstract the motion patterns, and HMMs in the upper layer represent the interaction patterns. We demonstrate the validity of this model through a kickboxing match between a motion-captured human and a humanoid robot, in which the robot autonomously generates its motion in response to attacks by the human.
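The two-layer structure can be sketched abstractly: lower-layer models score the observed motion, and an upper-layer interaction mapping selects the response to generate. The toy scoring functions and the interaction table are illustrative stand-ins for the HMM likelihoods and interaction HMMs.

```python
# Minimal sketch of the two-layer idea: lower-layer models score the
# observed motion, and an upper-layer interaction table maps the
# recognized attack to a response motion. The scores and table are
# illustrative stand-ins for HMM likelihoods and interaction HMMs.

def recognize(observation, models):
    """Pick the lower-layer model with the highest score."""
    return max(models, key=lambda name: models[name](observation))

# Toy "likelihood" functions keyed by motion name (lower layer).
lower_layer = {
    "jab":  lambda obs: 0.9 if obs == "fast_straight" else 0.1,
    "kick": lambda obs: 0.9 if obs == "leg_swing" else 0.1,
}

# Upper layer: which generation model answers which recognized motion.
interaction = {"jab": "duck", "kick": "step_back"}

attack = recognize("leg_swing", lower_layer)
response = interaction[attack]
```

In the real system, both layers are HMMs, so recognition and generation share the same stochastic representation at each level.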