In this paper we report on a recent public experiment in which two robots make pancakes using web instructions. In the experiment, the robots retrieve instructions for making pancakes from the World Wide Web and generate robot action plans from them. The task is jointly performed by two autonomous robots: the first robot opens and closes cupboards and drawers, takes a pancake mix from the refrigerator, and hands it to the second robot, which cooks and flips the pancakes and then delivers them back to the first robot. While the robot plans in this scenario are all percept-guided, they are also limited in various ways and rely on manually implemented sub-plans for parts of the task. We therefore discuss the potential of the underlying technologies as well as the research challenges raised by the experiment.
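As a rough illustration of the instruction-to-plan translation described above, the following sketch (with hypothetical step and action names that are not taken from the authors' system) shows how parsed web-instruction steps might be mapped onto parameterized plan primitives, falling back to a manually written sub-plan when no primitive is known:

```python
# Minimal sketch (hypothetical step/action names, not the authors' plan language):
# map parsed web-instruction steps onto parameterized robot plan primitives.
from dataclasses import dataclass
from typing import Optional

@dataclass
class InstructionStep:
    verb: str                 # e.g. "pour"
    obj: str                  # e.g. "pancake mix"
    target: Optional[str]     # e.g. "frying pan"

# Assumed lookup from instruction verbs to plan primitives.
ACTION_LIBRARY = {
    "take": lambda s: ("pick-up", {"object": s.obj}),
    "pour": lambda s: ("pour-onto", {"object": s.obj, "destination": s.target}),
    "flip": lambda s: ("flip-with-spatula", {"object": s.obj}),
}

def instructions_to_plan(steps):
    """Translate instruction steps into a plan; unknown verbs fall back to manual sub-plans."""
    plan = []
    for step in steps:
        make_action = ACTION_LIBRARY.get(step.verb)
        if make_action is not None:
            plan.append(make_action(step))
        else:
            # In the experiment, such gaps correspond to manually implemented sub-plans.
            plan.append(("manual-sub-plan", {"step": step.verb}))
    return plan

if __name__ == "__main__":
    recipe = [InstructionStep("take", "pancake mix", None),
              InstructionStep("pour", "pancake mix", "frying pan"),
              InstructionStep("flip", "pancake", None)]
    print(instructions_to_plan(recipe))
```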
In this paper we present a method for building complete models for grasping from a single 3D snapshot of a scene composed of objects of daily use in human living environments. We employ fast shape estimation together with probabilistic model fitting and verification methods capable of dealing with different kinds of symmetries, and combine these with a triangular mesh of the parts that have no other representation in order to model previously unseen objects of arbitrary shape. Our approach is further supported by geometric cues about the different parts of an object, which serve as prior information for selecting the appropriate reconstruction method. While we designed our system for grasping based on single-view 3D data, its generality also allows us to combine multiple views. We present two application scenarios that require complete geometric models: grasp planning and locating objects in camera images.
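The selection of a reconstruction method from geometric cues could be organized roughly as in the following sketch; the fitting routine, verification score, and mesh fallback are simplified toy stand-ins, not the paper's actual estimators:

```python
# Minimal sketch (toy fitter and scores, not the paper's estimators): select a
# reconstruction method for an object part from a geometric cue, and fall back to a
# triangular mesh when no parametric model verifies well enough.
import numpy as np

def fit_sphere(points):
    """Toy least-squares sphere fit; returns (model, verification score in [0, 1])."""
    A = np.hstack([2.0 * points, np.ones((len(points), 1))])
    b = (points ** 2).sum(axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    center = sol[:3]
    radius = np.sqrt(sol[3] + center @ center)
    residuals = np.abs(np.linalg.norm(points - center, axis=1) - radius)
    return {"type": "sphere", "center": center, "radius": radius}, float(np.exp(-residuals.mean()))

def triangulate_mesh(points):
    # Placeholder: a real system would build a triangular mesh of the remaining part.
    return {"type": "mesh", "n_points": len(points)}

def reconstruct_part(points, cue, threshold=0.9):
    """Use the geometric cue as a prior for which parametric model to try first."""
    fitters = {"spherical": fit_sphere}   # cylinder, plane, symmetry fitters omitted
    fitter = fitters.get(cue)
    if fitter is not None:
        model, score = fitter(points)
        if score >= threshold:
            return model
    return triangulate_mesh(points)

if __name__ == "__main__":
    directions = np.random.randn(500, 3)
    sphere = 0.05 * directions / np.linalg.norm(directions, axis=1, keepdims=True)
    print(reconstruct_part(sphere + np.array([0.3, 0.0, 0.5]), "spherical"))
```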
The goal of this paper is to present the foreseen research activity of the European project "SHERPA", whose activities will officially start on February 1st, 2013. The goal of SHERPA is to develop a mixed ground and aerial robotic platform to support search and rescue activities in a real-world hostile environment, such as the alpine scenario that is specifically targeted in the project. Considering the technological platform and the alpine rescue scenario, we plan to address a number of research topics in cognition and control. What makes the project potentially very rich from a scientific viewpoint is the heterogeneity of the different actors of the SHERPA system and the capabilities they must possess: the human rescuer is the "busy genius", working in a team with the ground vehicle, the "intelligent donkey", and with the aerial platforms, i.e. the "trained wasps" and "patrolling hawks". Indeed, the research activity focuses on how the "busy genius" and the "SHERPA animals" interact and collaborate with each other, each with their own features and capabilities, toward the achievement of a common goal.
In this article we investigate the representation and acquisition of Semantic Object Maps (SOMs) that can serve as information resources for autonomous service robots performing everyday manipulation tasks in kitchen environments. These maps provide the robot with information about its operation environment that enables it to perform fetch and place tasks more efficiently and reliably. To this end, the semantic object maps can answer queries such as the following: "What do parts of the kitchen look like?", "How can a container be opened and closed?", "Where do objects of daily use belong?", "What is inside of cupboards/drawers?", etc. The semantic object maps presented in this article, which we call SOM+, extend the first generation of SOMs presented by Rusu et al. [1] in that the representation of SOM+ is designed more thoroughly and SOM+ also include knowledge about the appearance and articulation of furniture objects. Also, the acquisition methods for SOM+ substantially advance those developed in [1] in that SOM+ are acquired autonomously and with low-cost (Kinect) instead of very accurate (laser-based) 3D sensors. In addition, the perception methods are more general and are demonstrated to work in different kitchen environments.

I. INTRODUCTION

Robots that do not know where objects are have to search for them. Robots that do not know how objects look have to guess whether they have fetched the right one. Robots that do not know the articulation models of drawers and cupboards have to open them very carefully in order not to damage them. Thus, robots should store and maintain knowledge about their environment that enables them to perform their tasks more reliably and efficiently. We call the collection of this knowledge the robot's maps and consider maps to be models of the robot's operation environment that serve as information resources for better task performance. Robots build environment maps for many purposes. Most robot maps so far have been proposed for navigation. Robot maps for navigation enable robots to estimate their position in the environment, to check the reachability of the destination, and to compute navigation plans. Depending on their purpose, maps have to store different kinds of information in different forms. Maps might represent the occupancy of the environment in 2D or 3D grid cells, they might contain landmarks, or they might represent the topological structure of the environment. The maps might model objects of daily use as well as indoor, outdoor, underwater, extraterrestrial, and aerial environments.
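As an illustration of the kinds of queries listed above, the sketch below shows one possible, purely hypothetical schema for a semantic object map in which furniture objects carry poses, articulation models, and contents; it is not the SOM+ representation itself:

```python
# Minimal sketch (hypothetical schema, not the SOM+ representation): a semantic object
# map as typed furniture entries with articulation models and stored contents,
# supporting simple fetch-and-place queries.
from dataclasses import dataclass, field

@dataclass
class ArticulationModel:
    kind: str            # "prismatic" (drawer) or "revolute" (cupboard door)
    handle_pose: tuple   # 6D pose of the handle in the map frame
    limit: float         # opening distance (m) or opening angle (rad)

@dataclass
class FurnitureObject:
    name: str
    pose: tuple                       # 6D pose in the map frame
    articulation: ArticulationModel
    contents: list = field(default_factory=list)

class SemanticObjectMap:
    def __init__(self):
        self.furniture = {}

    def add(self, obj: FurnitureObject):
        self.furniture[obj.name] = obj

    def where_does_it_belong(self, item: str):
        """Answer 'Where do objects of daily use belong?' for a given item."""
        return [f.name for f in self.furniture.values() if item in f.contents]

    def how_to_open(self, name: str):
        """Answer 'How can a container be opened?' with its articulation model."""
        return self.furniture[name].articulation

if __name__ == "__main__":
    som = SemanticObjectMap()
    som.add(FurnitureObject("drawer_left", (0.8, 1.2, 0.7, 0, 0, 0),
                            ArticulationModel("prismatic", (0.8, 1.2, 0.75, 0, 0, 0), 0.4),
                            contents=["spatula", "fork"]))
    print(som.where_does_it_belong("spatula"))   # ['drawer_left']
    print(som.how_to_open("drawer_left").kind)   # prismatic
```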
Humanoid robotic assistants need capable and comprehensive perception systems that enable them to perform complex manipulation and grasping tasks. This requires the identification and recognition of supporting planes and objects in the world, together with their precise 6D poses. In this paper, we propose a 3D perception system architecture that can robustly fit CAD models in cluttered table setting scenes for the purpose of grasping with a mobile manipulator. Our approach uses a powerful combination of two different camera technologies, Time-of-Flight (TOF) and RGB, to robustly segment the scene and extract object clusters. Using an a priori database of object models, we then perform CAD matching in 2D camera images. We validate the proposed system in a number of experiments and compare its performance and reliability with similar initiatives.
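The cluster-extraction step of such a pipeline could look roughly like the following toy numpy sketch, which removes a horizontal supporting plane and groups the remaining points by greedy Euclidean clustering; the thresholds and the plane estimate are illustrative assumptions, not the paper's method:

```python
# Minimal sketch (toy numpy implementation, not the described TOF/RGB system): remove a
# horizontal supporting plane from a depth point cloud and group the remaining points
# into object clusters with a greedy Euclidean clustering.
import numpy as np

def extract_clusters(points, plane_height=None, eps=0.03, min_points=20):
    """points: Nx3 array in meters; returns a list of Nx3 cluster arrays."""
    if plane_height is None:
        plane_height = np.median(points[:, 2])      # crude estimate of the table plane
    above = points[points[:, 2] > plane_height + 0.01]
    clusters, unassigned = [], list(range(len(above)))
    while unassigned:
        seed = [unassigned.pop(0)]
        cluster = set(seed)
        while seed:
            i = seed.pop()
            d = np.linalg.norm(above[unassigned] - above[i], axis=1)
            close = [unassigned[j] for j in np.where(d < eps)[0]]
            for j in close:
                unassigned.remove(j)
                cluster.add(j)
                seed.append(j)
        if len(cluster) >= min_points:
            clusters.append(above[sorted(cluster)])
    return clusters

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    table = np.column_stack([rng.uniform(0, 1, 2000), rng.uniform(0, 1, 2000),
                             np.full(2000, 0.75)])
    mug = rng.normal([0.3, 0.4, 0.85], 0.02, size=(200, 3))
    print(len(extract_clusters(np.vstack([table, mug]))))   # expect 1 cluster
```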
This article describes an object perception system for autonomous robots performing everyday manipulation tasks in kitchen environments. The perception system gains its strength by exploiting the fact that the robots are to perform the same kinds of tasks with the same objects over and over again. It does so by learning the object representations necessary for recognition and reconstruction in the context of pick and place tasks. The system employs a library of specialized perception routines that solve different, well-defined perceptual sub-tasks and can be combined into composite perceptual activities, including the construction of an object model database, multi-modal object classification, and object model reconstruction for grasping. We evaluate the effectiveness of our methods and give examples of application scenarios using our personal robotic assistants acting in a human living environment.
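The idea of combining specialized routines into composite perceptual activities might be organized as in the following sketch; the routine names and placeholder results are hypothetical and only illustrate the composition mechanism:

```python
# Minimal sketch (hypothetical routine names, not the authors' library): specialized
# perception routines registered by the sub-task they solve and chained into a
# composite perceptual activity.
ROUTINES = {}

def routine(name):
    """Register a perception routine under the sub-task it solves."""
    def register(fn):
        ROUTINES[name] = fn
        return fn
    return register

@routine("detect-clusters")
def detect_clusters(scene):
    return {**scene, "clusters": ["cluster_0", "cluster_1"]}       # placeholder result

@routine("classify-multimodal")
def classify(scene):
    return {**scene, "labels": ["mug", "cereal_box"]}              # placeholder result

@routine("reconstruct-for-grasping")
def reconstruct(scene):
    return {**scene, "grasp_models": ["mesh_0", "mesh_1"]}         # placeholder result

def composite_activity(scene, steps):
    """Run a sequence of registered routines as one composite perceptual activity."""
    for step in steps:
        scene = ROUTINES[step](scene)
    return scene

if __name__ == "__main__":
    result = composite_activity({"rgbd": "frame_001"},
                                ["detect-clusters", "classify-multimodal",
                                 "reconstruct-for-grasping"])
    print(result["labels"], result["grasp_models"])
```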
In this paper we present a comprehensive object categorization and classification system of great importance for mobile manipulation applications in indoor environments. In particular, we tackle the problem of recognizing everyday objects that are useful to a personal robotic assistant in fulfilling its tasks, using a hierarchical multi-modal 3D-2D processing and classification system. The acquired 3D data is used to estimate geometric labels (plane, cylinder, edge, rim, sphere) at each voxel cell using the Radius-based Surface Descriptor (RSD). We then propose the use of a Global RSD feature (GRSD) to categorize point clusters that are geometrically identical into one of the object categories. Once a geometric category and a 3D position are obtained for each object cluster, we extract the corresponding region of interest in the camera image and compute a SURF-based feature vector for it. From the appearance we thus obtain the exact object instance and the orientation around the object's upright axis. The resulting system provides a hierarchical categorization of objects into basic classes from their geometry and identifies objects and their poses based on their appearance, with near real-time performance. We validate our approach on an extensive database of objects acquired using real sensing devices, on both unseen views and unseen objects.
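The two-stage hierarchy (geometric category first, object instance from appearance second) can be illustrated with the following toy sketch; the histogram descriptor and nearest-prototype matching are simplified stand-ins for RSD/GRSD and SURF, not the paper's implementation:

```python
# Minimal sketch (toy descriptors and nearest-neighbour matching, not the RSD/GRSD/SURF
# implementation): first assign a geometric category from a global surface-label
# histogram, then resolve the exact instance from an appearance feature.
import numpy as np

SURFACE_LABELS = ["plane", "cylinder", "edge", "rim", "sphere"]

def global_descriptor(voxel_labels):
    """Histogram of per-voxel surface labels (stand-in for the GRSD feature)."""
    hist = np.array([voxel_labels.count(l) for l in SURFACE_LABELS], dtype=float)
    return hist / max(hist.sum(), 1.0)

def classify_geometry(voxel_labels, category_prototypes):
    """Nearest prototype in descriptor space gives the geometric category."""
    d = global_descriptor(voxel_labels)
    return min(category_prototypes,
               key=lambda c: np.linalg.norm(d - category_prototypes[c]))

def identify_instance(roi_feature, instance_db):
    """Second stage: match an appearance feature of the image ROI against known instances."""
    return min(instance_db, key=lambda name: np.linalg.norm(roi_feature - instance_db[name]))

if __name__ == "__main__":
    prototypes = {"box":    global_descriptor(["plane"] * 8 + ["edge"] * 2),
                  "bottle": global_descriptor(["cylinder"] * 7 + ["rim"] * 3)}
    labels = ["cylinder"] * 6 + ["rim"] * 2 + ["plane"]
    print(classify_geometry(labels, prototypes))   # expect "bottle"
```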