Abstract: Social activity expressed through body motion is a key feature of non-verbal, physical behaviour, serving as a communicative signal in social interaction between individuals. Social activity recognition is important for studying human-human communication and also human-robot interaction. Based on that, this research has three goals: (1) to recognise social behaviour (e.g. human-human interaction) using a probabilistic approach that merges spatio-temporal features from individual bodies with social features derived from the relationship between two individuals; (2) to learn priors based on the physical proximity between individuals during an interaction, using proxemics theory, to feed a probabilistic ensemble of activity classifiers; and (3) to provide a public dataset with RGB-D data of social daily activities, including risk situations, useful for testing approaches for assisted living, since this type of dataset is still missing. Results show that the proposed approach, designed to merge features with different semantics together with proximity priors, improves classification performance in terms of precision, recall and accuracy when compared with approaches that employ alternative strategies.
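The proximity priors described above can be illustrated with a minimal sketch based on Hall's proxemics zones. The zone boundaries are the standard ones from proxemics theory, but the prior probabilities attached to each zone are assumptions of this sketch, not values taken from the paper:

```python
import math

# Hall's proxemics zones (upper bound in metres) mapped to illustrative
# prior probabilities that a social interaction is taking place.
# The zone boundaries are standard; the probabilities are assumptions.
PROXEMICS_ZONES = [
    (0.45, "intimate", 0.90),
    (1.20, "personal", 0.75),
    (3.60, "social", 0.40),
    (float("inf"), "public", 0.05),
]

def interaction_prior(torso_a, torso_b):
    """Return (zone, prior) for two 3D torso positions."""
    d = math.dist(torso_a, torso_b)  # Euclidean inter-person distance
    for upper_bound, zone, prior in PROXEMICS_ZONES:
        if d <= upper_bound:
            return zone, prior
```

Such a prior could then weight the output of an ensemble of activity classifiers, boosting social-activity hypotheses when two people stand close together.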
Abstract: We present a system for the temporal detection of social interactions. Many works until now have succeeded in recognising activities from clipped videos in datasets, but for robotic applications it is important to move towards more realistic data. For this reason, the proposed approach temporally detects the intervals where individual or social activity is occurring. Recognition of human activities is a key capability for analysing human behaviour. In particular, recognition of social activities is useful to trigger human-robot interactions or to detect situations of potential danger. Based on that, this research has three goals: (1) to define a new set of descriptors able to characterise human interactions; (2) to develop a computational model to segment temporal intervals with social interaction or individual behaviour; (3) to provide a public dataset with a continuous stream of RGB-D data covering individual activities and social interactions. Results show that the proposed approach attained relevant performance in the temporal segmentation of social activities.
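As an illustration of the kind of pairwise descriptors that can characterise human interactions, the sketch below computes inter-person distance and a mutual-facing score from tracked body data. These two specific descriptors are assumptions for illustration, not the paper's actual feature set:

```python
import math

def social_descriptors(torso_a, torso_b, gaze_a, gaze_b):
    """Compute simple pairwise descriptors for two tracked people.

    torso_*: 2D ground-plane positions; gaze_*: 2D unit heading vectors.
    Returns inter-person distance and a mutual-facing score in [-2, 2],
    where +2 means the two people are directly facing each other.
    """
    dx, dy = torso_b[0] - torso_a[0], torso_b[1] - torso_a[1]
    dist = math.hypot(dx, dy)
    ux, uy = dx / dist, dy / dist              # unit direction a -> b
    facing_a = gaze_a[0] * ux + gaze_a[1] * uy        # +1 if a faces b
    facing_b = -(gaze_b[0] * ux + gaze_b[1] * uy)     # +1 if b faces a
    return {"distance": dist, "mutual_facing": facing_a + facing_b}
```

A temporal segmenter could then threshold or classify sequences of such descriptors to separate intervals of social interaction from individual behaviour.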
The automatic detection of anomalies in Active and Assisted Living (AAL) environments is important for monitoring the wellbeing and safety of the elderly at home. The integration of smart domotic sensors (e.g. presence detectors) and those equipping modern mobile robots (e.g. RGB-D cameras) provides new opportunities for addressing this challenge. In this paper, we propose a novel solution to combine local activity levels detected by a single RGB-D camera with the global activity perceived by a network of domotic sensors. Our approach relies on a new method for computing this global activity from various presence detectors, based on the concept of entropy from information theory. This entropy effectively indicates how active a particular room or area of the environment is. The solution also includes a new application of Hybrid Markov Logic Networks (HMLNs) to merge different information sources for local and global anomaly detection. The system has been tested with a comprehensive dataset of RGB-D and domotic data containing entries from 37 different domotic sensors (presence, temperature, light, energy consumption, door contact), which is made publicly available. The experimental results show the effectiveness of our approach and its potential for complex anomaly detection in AAL settings.
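The entropy idea referenced above can be sketched generically with Shannon entropy over presence-detector triggers. This is a minimal illustration of the information-theoretic concept, not the paper's exact formulation:

```python
import math
from collections import Counter

def activity_entropy(detections):
    """Shannon entropy (in bits) of presence detections across areas.

    `detections` is a list of room/area identifiers, one per triggering
    event in a time window. High entropy means activity is spread over
    many areas; zero entropy means it is concentrated in a single place.
    """
    counts = Counter(detections)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values())
```

A sudden drop of this value to zero (all activity concentrated in one room for a long period), or an unusually high value at night, could then be flagged as a candidate global anomaly.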
Modern service robots are equipped with one or more sensors, often including RGB-D cameras, to perceive objects and humans in the environment. This paper proposes a new system for the recognition of human social activities from a continuous stream of RGB-D data. Many of the works until now have succeeded in recognising activities from clipped videos in datasets, but for robotic applications it is important to be able to move to more realistic scenarios in which such activities are not manually selected. For this reason, it is useful to detect the time intervals when humans are performing social activities, the recognition of which can contribute to triggering human-robot interactions or to detecting situations of potential danger. The main contributions of this research work include a novel system for the recognition of social activities from continuous RGB-D data, combining temporal segmentation and classification, as well as a model for learning the proximity-based priors of the social activities. A new public dataset with RGB-D videos of social and individual activities is also provided and used for evaluating the proposed solutions. The results show the good performance of the system in recognising social activities from continuous RGB-D data.
Grasp stability prediction of unknown objects is crucial to enable autonomous robotic manipulation in an unstructured environment. Even if prior information about the object is available, real-time local exploration might be necessary to mitigate object modelling inaccuracies. This paper presents an approach to predict safe grasps of unknown objects using depth vision and a dexterous robot hand equipped with tactile feedback. Our approach does not assume any prior knowledge about the objects. First, an object pose estimation is obtained from RGB-D sensing; then, the object is explored haptically to maximise a given grasp metric. We compare two probabilistic methods (i.e. standard and unscented Bayesian Optimisation) against random exploration (i.e. uniform grid search). Our experimental results demonstrate that these probabilistic methods can provide confident predictions after a limited number of exploratory observations, and that unscented Bayesian Optimisation can find safer grasps, taking into account the uncertainty in robot sensing and grasp execution.
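The Bayesian Optimisation loop described above can be sketched with a Gaussian Process surrogate and an expected-improvement acquisition function. The 1-D pose parametrisation, the fixed RBF length scale, and the grid of candidates are all placeholders of this sketch; the paper's actual grasp metric and unscented variant are not reproduced here:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def bo_step(X, y, candidates, xi=0.01):
    """One Bayesian Optimisation step over candidate grasp poses.

    X, y: poses explored so far and their measured grasp metric.
    Returns the candidate pose maximising expected improvement (EI),
    i.e. the most promising pose to explore haptically next.
    """
    # Fixed-kernel GP surrogate of the grasp metric (optimizer=None
    # keeps the sketch deterministic with few observations).
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2),
                                  optimizer=None,
                                  normalize_y=True).fit(X, y)
    mu, sigma = gp.predict(candidates, return_std=True)
    best = y.max()
    z = (mu - best - xi) / np.maximum(sigma, 1e-9)
    ei = (mu - best - xi) * norm.cdf(z) + sigma * norm.pdf(z)
    return candidates[np.argmax(ei)]
```

In contrast, the random-exploration baseline mentioned in the abstract would simply evaluate candidates on a uniform grid, ignoring the surrogate's uncertainty.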
The contactless estimation of the weight of a container and the amount of its content manipulated by a person are key pre-requisites for safe human-to-robot handovers. However, opaqueness and transparency of the container and the content, and variability of materials, shapes, and sizes, make this problem challenging. In this paper, we present a range of methods and an open framework to benchmark acoustic and visual perception for the estimation of the capacity of a container, and the type, mass, and amount of its content. The framework includes a dataset, specific tasks and performance measures. We conduct a fair and in-depth comparative analysis of methods that used this framework and of audio-only or vision-only baselines designed from related works. Based on this analysis, we can conclude that audio-only and audio-visual classifiers are suitable for the estimation of the type and amount of the content using different types of convolutional neural networks, combined with either recurrent neural networks or a majority voting strategy, whereas computer vision methods are suitable to determine the capacity of the container using regression and geometric approaches. Classifying the content type and level using only audio achieves a weighted average F1-score of up to 81% and 97%, respectively. When estimating the container capacity with vision-only approaches and the filling mass with audio-visual, multi-stage algorithms, weighted average capacity and mass scores reach up to 65%. These results show that there is still room for improvement in the design of future methods, which will be ranked and compared on the individual leaderboards provided by our open framework.
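The majority voting strategy mentioned above, used as an alternative to recurrent networks for aggregating per-frame classifier outputs into a sequence-level decision, can be sketched in a few lines. The content-type labels in the example are illustrative, not taken from the benchmark:

```python
from collections import Counter

def majority_vote(frame_predictions):
    """Aggregate per-frame class predictions into one sequence label.

    Ties are broken by first occurrence in the sequence, which is an
    arbitrary choice of this sketch.
    """
    return Counter(frame_predictions).most_common(1)[0][0]
```

For example, a per-frame CNN content-type classifier that outputs a label for each audio frame of a pouring action can be reduced to a single prediction for the whole recording this way.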