Abstract-Affordances encode relationships between actions, objects, and effects. They play an important role in basic cognitive capabilities such as prediction and planning. We address the problem of learning affordances through the interaction of a robot with the environment, a key step toward understanding the properties of the world and developing social skills. We present a general model for learning object affordances using Bayesian networks, integrated within a general developmental architecture for social robots. Since learning is based on a probabilistic model, the approach is able to deal with uncertainty, redundancy, and irrelevant information. We demonstrate successful learning in the real world by having a humanoid robot interact with objects, and we illustrate the benefits of the acquired knowledge in imitation games.
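The core idea of the abstract above can be sketched as a small discrete Bayesian network. The structure, variable names, and probability tables below are illustrative assumptions, not the paper's learned network: an action and an object property jointly determine an effect, which supports both forward prediction and the inverse inference used in imitation.

```python
# Minimal sketch of an affordance Bayesian network (assumed structure and
# CPTs for illustration; the paper learns these from robot-object interaction).
# Nodes: Action A and ObjectSize S are parents of Effect E,
# so P(A, S, E) = P(A) * P(S) * P(E | A, S).

P_A = {"tap": 0.5, "grasp": 0.5}                    # prior over actions
P_E = {                                             # P(effect | action, size)
    ("tap", "small"):   {"moves": 0.9, "still": 0.1},
    ("tap", "big"):     {"moves": 0.4, "still": 0.6},
    ("grasp", "small"): {"moves": 0.7, "still": 0.3},
    ("grasp", "big"):   {"moves": 0.2, "still": 0.8},
}

def predict_effect(action, size):
    """Forward prediction: distribution over effects given action and object."""
    return P_E[(action, size)]

def infer_action(size, effect):
    """Inverse inference for imitation: P(action | size, effect) by Bayes' rule."""
    joint = {a: P_A[a] * P_E[(a, size)][effect] for a in P_A}
    z = sum(joint.values())
    return {a: p / z for a, p in joint.items()}

print(predict_effect("tap", "small"))   # {'moves': 0.9, 'still': 0.1}
print(infer_action("small", "moves"))   # tapping is the more likely cause
```

The same model answers both questions the abstract mentions: prediction (which effect will my action produce?) and action selection for imitation (which action most likely produced the observed effect?).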
Abstract-In the last few years, image classification has become an incredibly active research topic, with widespread applications. Most methods for visual recognition are fully supervised, as they make use of bounding boxes or pixelwise segmentations to locate objects of interest. However, this type of manual labeling is time-consuming and error-prone, and it has been shown that manual segmentations are not necessarily the optimal spatial enclosure for object classifiers. This paper proposes a weakly-supervised system for multi-label image classification. In this setting, training images are annotated with a set of keywords describing their contents, but the visual concepts are not explicitly segmented in the images. We formulate weakly-supervised image classification as a low-rank matrix completion problem. Compared to previous work, our proposed framework has three advantages: (1) Unlike existing solutions based on multiple-instance learning methods, our model is convex. We propose two alternative algorithms for matrix completion specifically tailored to visual data, and prove their convergence. (2) Unlike existing discriminative methods, our algorithm is robust to labeling errors, background noise and partial occlusions. (3) Our method can potentially be used for semantic segmentation. Experimental validation on several datasets shows that our method outperforms state-of-the-art classification algorithms, while effectively capturing the appearance of each class.
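Low-rank matrix completion, the convex formulation the abstract above refers to, can be sketched with the standard soft-impute iteration: alternate between filling missing entries with the current estimate and soft-thresholding the singular values. This is a generic convex surrogate for rank minimization, not the paper's two tailored algorithms; the data, threshold, and iteration count are assumptions for the demo.

```python
import numpy as np

def soft_impute(M, mask, tau=0.5, iters=200):
    """Complete matrix M (mask is True where entries are observed) by
    iterative singular-value soft-thresholding -- a standard convex
    approach to low-rank matrix completion (illustrative sketch only)."""
    X = np.where(mask, M, 0.0)
    for _ in range(iters):
        # Keep observed entries, fill the rest with the current estimate.
        U, s, Vt = np.linalg.svd(np.where(mask, M, X), full_matrices=False)
        s = np.maximum(s - tau, 0.0)    # shrink singular values toward low rank
        X = (U * s) @ Vt
    return X

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 20))  # rank-3 matrix
mask = rng.random(A.shape) < 0.7        # observe ~70% of the entries
Ahat = soft_impute(A, mask)
err = np.linalg.norm((Ahat - A)[~mask]) / np.linalg.norm(A[~mask])
print(f"relative error on missing entries: {err:.3f}")
```

In the weakly-supervised setting, rows of the matrix would stack image features with label indicators, and the unobserved block (the labels of test images) is what completion recovers.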
Abstract-This work presents a multimodal bottom-up attention system for the humanoid robot iCub, in which the robot's decisions to move its eyes and neck are based on visual and acoustic saliency maps. We introduce a modular and distributed software architecture capable of fusing visual and acoustic saliency maps into one egocentric frame of reference. This system endows the iCub with an emergent exploratory behavior that reacts to combined visual and auditory saliency. The developed software modules provide a flexible foundation for the open iCub platform and for further experiments and developments, including higher levels of attention and representation of the peripersonal space.
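The fusion step described above can be sketched as a weighted combination of two saliency maps that share one egocentric grid. The weights, map sizes, and winner-take-all selection below are illustrative assumptions, not the iCub architecture itself, which registers the maps across a distributed set of modules.

```python
import numpy as np

def fuse_saliency(visual, acoustic, w_v=0.6, w_a=0.4):
    """Weighted sum of two saliency maps assumed already registered in one
    egocentric grid; returns the fused map and the cell that would win the
    next gaze shift (illustrative sketch, weights are assumptions)."""
    fused = w_v * visual + w_a * acoustic
    return fused, np.unravel_index(np.argmax(fused), fused.shape)

visual = np.zeros((4, 8)); visual[1, 2] = 1.0       # bright visual stimulus
acoustic = np.zeros((4, 8)); acoustic[3, 6] = 0.5   # weaker sound source
fused, target = fuse_saliency(visual, acoustic)
print(target)   # (1, 2): the visual stimulus wins the gaze shift
```

A stronger acoustic event (or a larger `w_a`) would shift the gaze target to the sound source, which is the emergent exploratory behavior the abstract describes.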
The concept of affordances appeared in psychology during the late 1960s as an alternative perspective on the visual perception of the environment. Its revolutionary intuition was that the way living beings perceive the world is deeply influenced by the actions they are able to perform. Over the last 40 years, it has influenced many applied fields: e.g., design, human-computer interaction, computer vision, robotics. In this paper we offer a multidisciplinary perspective on the notion of affordances: we first discuss the main definitions and formalizations of the affordance theory, then we report the most significant evidence in psychology and neuroscience that supports it, and finally we review the most relevant applications of this concept in robotics.
Abstract-We propose an approach for vision-based navigation of underwater robots that relies on the use of video mosaics of the sea bottom as environmental representations for navigation. We present a methodology for building high-quality video mosaics of the sea bottom, in a fully automatic manner, that ensures global spatial coherency. During navigation, a set of efficient visual routines is used for the fast and accurate localization of the underwater vehicle with respect to the mosaic. These visual routines were developed taking into account the operating requirements of real-time position sensing, error bounding and computational load. A visual servoing controller, based on the vehicle kinematics, is used to drive the vehicle along a computed trajectory, specified in the mosaic, while maintaining constant altitude. The trajectory towards a goal point is generated online to avoid undefined areas in the mosaic. We have conducted a large set of sea trials under realistic operating conditions. This paper demonstrates that, without resorting to additional sensors, visual information can be used to create environment representations of the sea bottom (mosaics) and support long runs of navigation in a robust manner.
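The registration step at the heart of mosaic building and mosaic-based localization is a planar homography between image frames. The Direct Linear Transform (DLT) below is a textbook sketch of that step, assuming known point correspondences; a real pipeline like the one in the abstract would add feature matching and robust outlier rejection, and the transform and points here are synthetic.

```python
import numpy as np

def dlt_homography(src, dst):
    """Direct Linear Transform: estimate H such that dst ~ H @ src in
    homogeneous coordinates. Illustrative sketch of frame-to-mosaic
    registration; real systems add feature matching and RANSAC."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on vec(H).
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    H = Vt[-1].reshape(3, 3)        # null vector = homography up to scale
    return H / H[2, 2]

# Synthetic check: points related by a known planar transform (assumed data).
H_true = np.array([[0.9, -0.1, 5.0],
                   [0.1,  0.9, -3.0],
                   [0.0,  0.0, 1.0]])
src = np.array([[0, 0], [10, 0], [10, 10], [0, 10], [5, 7]], float)
dst_h = (H_true @ np.c_[src, np.ones(len(src))].T).T
dst = dst_h[:, :2] / dst_h[:, 2:]
H = dlt_homography(src, dst)
print(np.allclose(H, H_true, atol=1e-6))   # True
```

Chaining such frame-to-frame homographies (with global optimization for spatial coherency) places every frame in the mosaic, and the same estimation run against the stored mosaic yields the vehicle's position during navigation.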