HandSOM - neural clustering of hand motion for gesture recognition in real time

Parisi, German I.; Jirak, Doreen; Wermter, Stefan

doi:10.1109/roman.2014.6926380

Cited by 7 publications

(4 citation statements)

References 17 publications


“…Finally, the reported results motivate the embedding of our learning system into mobile robot platforms to conduct further evaluations in more complex scenarios, where the robust recognition of actions plays a key role. For instance, the visual detection of dangerous events for assistive robotics such as fall events (Parisi and Wermter, 2013, 2015), and the recognition of actions with learning robots in HRI scenarios (Soltoggio et al, 2013a, b; Barros et al, 2014; Parisi et al, 2014a, b).…”

Section: Discussion (mentioning)

confidence: 99%

“…Learning systems using depth information from low-cost sensors are increasingly popular in the research community encouraged by the combination of computational efficiency and robustness to light changes in indoor environments. In recent years, a large number of applications using 3D motion information has been proposed for human activity recognition such as classification of full-body actions (Faria et al, 2014; Shan and Akella, 2014; Parisi et al, 2014c), fall detection (Rougier et al, 2011; Mastorakis and Makris, 2012; Parisi and Wermter, 2013), and recognition of hand gestures (Suarez and Murphy, 2012; Parisi et al, 2014a, b; Yanik et al, 2014). A vast number of depth-based methods has used a 3D human skeleton model to extract relevant action features for the subsequent use of a classification algorithm.…”

Section: Recognition Of Human Actions (mentioning)

confidence: 99%


Self-organizing neural integration of pose-motion features for human action recognition

2015

Self Cite


The visual recognition of complex, articulated human movements is fundamental for a wide range of artificial systems oriented toward human-robot communication, action classification, and action-driven perception. These challenging tasks may generally involve the processing of a huge amount of visual information and learning-based mechanisms for generalizing a set of training actions and classifying new samples. To operate in natural environments, a crucial property is the efficient and robust recognition of actions, also under noisy conditions caused by, for instance, systematic sensor errors and temporarily occluded persons. Studies of the mammalian visual system and its outperforming ability to process biological motion information suggest separate neural pathways for the distinct processing of pose and motion features at multiple levels and the subsequent integration of these visual cues for action perception. We present a neurobiologically-motivated approach to achieve noise-tolerant action recognition in real time. Our model consists of self-organizing Growing When Required (GWR) networks that obtain progressively generalized representations of sensory inputs and learn inherent spatio-temporal dependencies. During the training, the GWR networks dynamically change their topological structure to better match the input space. We first extract pose and motion features from video sequences and then cluster actions in terms of prototypical pose-motion trajectories. Multi-cue trajectories from matching action frames are subsequently combined to provide action dynamics in the joint feature space. Reported experiments show that our approach outperforms previous results on a dataset of full-body actions captured with a depth sensor, and ranks among the best results for a public benchmark of domestic daily actions.


“…The proposed architectures can be considered a further step towards more flexible neural network models for learning robust visual representations on the basis of visual experience. Successful applications of deep neural network self-organization include human action recognition (Parisi, Weber & Wermter 2014, Elfaramawy et al 2017), gesture recognition (Parisi, Barros & Wermter 2014, Parisi, Jirak & Wermter 2014), body motion assessment (Parisi, von Stosch, Magg & Wermter 2015, Parisi, Magg & Wermter 2016), human-object interaction (Mici et al 2017, 2018), continual learning (Parisi et al 2017, Parisi, Tani, Weber & Wermter 2018), and audio-visual integration (Parisi, Tani, Weber & Wermter 2016). Models of hierarchical action learning are typically feedforward.…”

Section: Conclusion and Open Challenges (mentioning)

confidence: 99%

Human Action Recognition and Assessment via Deep Neural Network Self-Organization

Parisi

2020

Preprint

Self Cite


The robust recognition and assessment of human actions are crucial in human-robot interaction (HRI) domains. While state-of-the-art models of action perception show remarkable results in large-scale action datasets, they mostly lack the flexibility, robustness, and scalability needed to operate in natural HRI scenarios which require the continuous acquisition of sensory information as well as the classification or assessment of human body patterns in real time. In this chapter, I introduce a set of hierarchical models for the learning and recognition of actions from depth maps and RGB images through the use of neural network self-organization. A particularity of these models is the use of growing self-organizing networks that quickly adapt to non-stationary distributions and implement dedicated mechanisms for continual learning from temporally correlated input.


“…For recognizing gestures, we used an extended version of neural network learning for gesture recognition [25] that extracts hand-independent gesture features from depth map sequences. The learning model consists of a set of two hierarchically arranged self-organizing networks that learn the spatiotemporal structure of the input sequences in terms of gesture features.…”

Section: A. Speech and Gesture Recognition (mentioning)

confidence: 99%

Multi-modal Feedback for Affordance-driven Interactive Reinforcement Learning

Cruz, Parisi, Wermter

2018

2018 International Joint Conference on Neural Networks (IJCNN)

Self Cite


Interactive reinforcement learning (IRL) extends traditional reinforcement learning (RL) by allowing an agent to interact with parent-like trainers during a task. In this paper, we present an IRL approach using dynamic audio-visual input in terms of vocal commands and hand gestures as feedback. Our architecture integrates multi-modal information to provide robust commands from multiple sensory cues along with a confidence value indicating the trustworthiness of the feedback. The integration process also considers the case in which the two modalities convey incongruent information. Additionally, we modulate the influence of sensory-driven feedback in the IRL task using goal-oriented knowledge in terms of contextual affordances. We implement a neural network architecture to predict the effect of performed actions with different objects to avoid failed-states, i.e., states from which it is not possible to accomplish the task. In our experimental setup, we explore the interplay of multi-modal feedback and task-specific affordances in a robot cleaning scenario. We compare the learning performance of the agent under four different conditions: traditional RL, multi-modal IRL, and each of these two setups with the use of contextual affordances. Our experiments show that the best performance is obtained by using audio-visual feedback with affordance-modulated IRL. The obtained results demonstrate the importance of multi-modal sensory processing integrated with goal-oriented knowledge in IRL tasks.
