We propose a benchmark for 6D pose estimation of a rigid object from a single RGB-D input image. The training data consists of a texture-mapped 3D object model or images of the object in known 6D poses. The benchmark comprises of: i) eight datasets in a unified format that cover different practical scenarios, including two new datasets focusing on varying lighting conditions, ii) an evaluation methodology with a pose-error function that deals with pose ambiguities, iii) a comprehensive evaluation of 15 diverse recent methods that captures the status quo of the field, and iv) an online evaluation system that is open for continuous submission of new results. The evaluation shows that methods based on point-pair features currently perform best, outperforming template matching methods, learning-based methods and methods based on 3D local features. The project website is available at bop.felk.cvut.cz.
We present a method for finding correspondence between 3D models. From an initial set of feature correspondences, our method uses a fast voting scheme to separate the inliers from the outliers. The novelty of our method lies in the use of a combination of local and global constraints to determine if a vote should be cast. On a local scale, we use simple, low-level geometric invariants. On a global scale, we apply covariant constraints for finding compatible correspondences. We guide the sampling for collecting voters by downward dependencies on previous voting stages. All of this together results in an accurate matching procedure. We evaluate our algorithm by controlled and comparative testing on different datasets, giving superior performance compared to state of the art methods. In a final experiment, we apply our method for 3D object detection, showing potential use of our method within higher-level vision.
If you would like to write for this, or any other Emerald publication, then please use our Emerald for Authors service information about how to choose which publication to write for and submission guidelines are available for all. Please visit www.emeraldinsight.com/authors for more information. About Emerald www.emeraldinsight.comEmerald is a global publisher linking research and practice to the benefit of society. The company manages a portfolio of more than 290 journals and over 2,350 books and book series volumes, as well as providing an extensive range of online products and additional customer resources and services.Emerald is both COUNTER 4 and TRANSFER compliant. The organization is a partner of the Committee on Publication Ethics (COPE) and also works with Portico and the LOCKSS initiative for digital archive preservation. AbstractPurpose -The purpose of this paper is to propose a new algorithm based on programming by demonstration and exception strategies to solve assembly tasks such as peg-in-hole. Design/methodology/approach -Data describing the demonstrated tasks are obtained by kinesthetic guiding. The demonstrated trajectories are transferred to new robot workspaces using three-dimensional (3D) vision. Noise introduced by vision when transferring the task to a new configuration could cause the execution to fail, but such problems are resolved through exception strategies. Findings -This paper demonstrated that the proposed approach combined with exception strategies outperforms traditional approaches for robot-based assembly. Experimental evaluation was carried out on Cranfield Benchmark, which constitutes a standardized assembly task in robotics. This paper also performed statistical evaluation based on experiments carried out on two different robotic platforms. Practical implications -The developed framework can have an important impact for robot assembly processes, which are among the most important applications of industrial robots. Our future plans involve implementation of our framework in a commercially available robot controller. Originality/value -This paper proposes a new approach to the robot assembly based on the Learning by Demonstration (LbD) paradigm. The proposed framework enables to quickly program new assembly tasks without the need for detailed analysis of the geometric and dynamic characteristics of workpieces involved in the assembly task. The algorithm provides an effective disturbance rejection, improved stability and increased overall performance. The proposed exception strategies increase the success rate of the algorithm when the task is transferred to new areas of the workspace, where it is necessary to deal with vision noise and altered dynamic characteristics of the task. He is currently pursuing his PhD on efficient methods for 3D shape-based matching. During the course of this, he has been a visiting researcher at Tampere University of Technology, Finland. He has published in notable conferences within the fields of computer vision and robotics. His research activitie...
It is possible to associate a highly constrained subset of relative 6 DoF poses between two 3D shapes, as long as the local surface orientation, the normal vector, is available at every surface point. Local shape features can be used to find putative point correspondences between the models due to their ability to handle noisy and incomplete data. However, this correspondence set is usually contaminated by outliers in practical scenarios, which has led to many past contributions based on robust detectors such as the Hough transform or RANSAC. The key insight of our work is that a single correspondence between oriented points on the two models is constrained to cast votes in a 1 DoF rotational subgroup of the full group of poses, SE(3). Kernel density estimation allows combining the set of votes efficiently to determine a full 6 DoF candidate pose between the models. This modal pose with the highest density is stable under challenging conditions, such as noise, clutter, and occlusions, and provides the output estimate of our method.We first analyze the robustness of our method in relation to noise and show that it handles high outlier rates much better than RANSAC for the task of 6 DoF pose estimation. We then apply our method to four state of the art data sets for 3D object recognition that contain occluded and cluttered scenes. Our method achieves perfect recall on two LI-DAR data sets and outperforms competing methods on two RGB-D data sets, thus setting a new standard for general 3D object recognition using point cloud data.
We present a three-level cognitive system in a Learning by Demonstration (LbD) context. The system allows for learning and transfer on the sensorimotor level as well as the planning level. The fundamentally different data structures associated to these two levels are connected by an efficient mid-level representation based on so called "Semantic Event Chains". We describe details of the representations and quantify the effect of the associated learning procedures for each level under different amounts of noise. Moreover, we demonstrate the performance of the overall system by three demonstrations that have been performed at a project review. The described system has a Technical Readiness Level (TRL) of 4, which in an ongoing follow-up project will be raised to TRL 6.
Abstract-We present a method for learning object grasp affordance models in 3D from experience, and demonstrate its applicability through extensive testing and evaluation on a realistic and largely autonomous platform. Grasp affordance refers here to relative object-gripper configurations that yield stable grasps. These affordances are represented probabilistically with grasp densities, which correspond to continuous density functions defined on the space of 6D gripper poses. A grasp density characterizes an object's grasp affordance; densities are linked to visual stimuli through registration with a visual model of the object they characterize. We explore a batch-oriented, experience-based learning paradigm where grasps sampled randomly from a density are performed, and an importance-sampling algorithm learns a refined density from the outcomes of these experiences. The first such learning cycle is bootstrapped with a grasp density formed from visual cues. We show that the robot effectively applies its experience by downweighting poor grasp solutions, which results in increased success rates at subsequent learning cycles. We also present success rates in a practical scenario where a robot needs to repeatedly grasp an object lying in an arbitrary pose, where each pose imposes a specific reaching constraint, and thus forces the robot to make use of the entire grasp density to select the most promising achievable grasp.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.