Abstract: This paper deals with the problem of object reconstruction for visual search by a humanoid robot. Three problems that must be solved to achieve this behavior autonomously are considered: full-body motion generation according to a camera pose, general object representation for visual recognition and pose estimation, and far-away visual detection of an object. First, we address the problem of generating full-body motion for an HRP-2 humanoid robot to achieve a camera pose given by a Next Best View algorithm. We use an optimization-based approach including self-collision avoidance, made possible by a body-to-body distance function with a continuous gradient. The second problem has received a lot of attention for several decades, and we present a solution based on 3D vision together with SIFT descriptors, making use of the information available from the robot. We show in this paper that one of the major limitations of this model is the perception distance. A new approach based on a generative object model is therefore presented to cope with more difficult situations. It relies on a local representation that handles occlusion as well as large scale and pose variations.
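The abstract mentions that optimization-based self-collision avoidance is enabled by a body-to-body distance function with a continuous gradient. A minimal sketch of that idea, assuming a log-sum-exp soft-minimum over pairwise distances; the function names, the `alpha` sharpness parameter, and the margin values are illustrative, not taken from the paper:

```python
import numpy as np

def smooth_min_distance(distances, alpha=10.0):
    """Soft-minimum of pairwise body-to-body distances.

    Unlike min(), the log-sum-exp soft-minimum is differentiable
    everywhere, so the gradient of a collision penalty built on it
    stays continuous even when the closest body pair changes.
    (Illustrative formulation, not the paper's exact one.)
    """
    d = np.asarray(distances, dtype=float)
    return -np.log(np.sum(np.exp(-alpha * d))) / alpha

def collision_penalty(distances, margin=0.05, weight=100.0, alpha=10.0):
    """Penalty term for the pose optimizer: grows quadratically as the
    soft-minimum distance drops below a safety margin (in meters)."""
    d_min = smooth_min_distance(distances, alpha)
    return weight * max(0.0, margin - d_min) ** 2
```

The soft-minimum slightly underestimates the true minimum distance, which is conservative in this context: the penalty activates a little earlier than a hard minimum would.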
This paper presents a framework for the visual search of a 3D object in a 3D environment performed by an HRP-2 humanoid robot. Object search is a sensor-planning problem in which an appropriate sensor configuration must be selected to allow proper recognition. Sensor planning is formulated as an optimization problem whose goal is to maximize the target detection probability while minimizing the energy, distance, and time needed to achieve the task. Knowledge of the target position is encoded in a probability distribution map that is updated after each sensing operation. Specificities of the humanoid robot as well as the characteristics of the recognition system are taken into account to restrict the sensor configuration space, so that heavy time constraints on the behavior can be met and a reactive, near human-like reaction time achieved.
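The probability distribution map described above is commonly maintained with a Bayesian update after each sensing operation. A minimal sketch under that assumption, for the case where a view returns no detection; the detector model `p_detect` and the flat grid layout are illustrative, not taken from the paper:

```python
import numpy as np

def update_probability_map(p_map, viewed, p_detect=0.9):
    """Bayesian update of the target-location map after one sensing
    operation that found nothing.

    p_map    : prior probability that the target is in each cell
    viewed   : boolean mask of cells covered by the camera view
    p_detect : probability of detecting the target when it lies in a
               viewed cell (illustrative value, not from the paper)
    """
    posterior = p_map.copy()
    # P(no detection | target in viewed cell) = 1 - p_detect
    posterior[viewed] *= (1.0 - p_detect)
    # Renormalize so the map remains a probability distribution
    return posterior / posterior.sum()
```

After a negative observation, probability mass flows from the viewed cells to the unviewed ones, which is what drives the planner toward regions not yet inspected.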
Aiming at building versatile humanoid systems, we present in this paper the real-time implementation of behaviors that integrate walking and vision to achieve general functionalities. This paper describes how real-time (or high-bandwidth) cognitive processes can be obtained by combining vision with walking. The central point of our methodology is to use appropriate models to reduce the complexity of the search space. We describe the models introduced in the different blocks of the system and their relationships: the walking pattern generator, self-localization and map building, real-time reactive vision behaviors, and planning.