This paper presents the Dexterity Network (DexNet) 1.0, a dataset of 3D object models and a sampling-based planning algorithm to explore how Cloud Robotics can be used for robust grasp planning. The algorithm uses a MultiArmed Bandit model with correlated rewards to leverage prior grasps and 3D object models in a growing dataset that currently includes over 10,000 unique 3D object models and 2.5 million parallel-jaw grasps. Each grasp includes an estimate of the probability of force closure under uncertainty in object and gripper pose and friction. Dex-Net 1.0 uses Multi-View Convolutional Neural Networks (MV-CNNs), a new deep learning method for 3D object classification, to provide a similarity metric between objects, and the Google Cloud Platform to simultaneously run up to 1,500 virtual cores, reducing experiment runtime by up to three orders of magnitude. Experiments suggest that correlated bandit techniques can use a cloud-based network of object models to significantly reduce the number of samples required for robust grasp planning. We report on system sensitivity to variations in similarity metrics and in uncertainty in pose and friction. Code and updated information is available at http://berkeleyautomation.github.io/dex-net/.
Abstract-We present a unified framework for grasp planning and in-hand grasp adaptation using visual, tactile and proprioceptive feedback. The main objective of the proposed framework is to enable fingertip grasping by addressing problems of changed weight of the object, slippage and external disturbances. For this purpose, we introduce the Hierarchical Fingertip Space (HFTS) as a representation enabling optimization for both efficient grasp synthesis and online finger gaiting. Grasp synthesis is followed by a grasp adaptation step that consists of both grasp force adaptation through impedance control and regrasping/finger gaiting when the former is not sufficient. Experimental evaluation is conducted on an Allegro hand mounted on a Kuka LWR arm.
Abstract-For applications such as warehouse order fulfillment, robot grasps must be robust to uncertainty arising from sensing, mechanics, and control. One way to achieve robustness is to evaluate the performance of candidate grasps by sampling perturbations in shape, pose, and gripper approach and to compute the probability of force closure for each candidate to identify a grasp with the highest expected quality. Since evaluating the quality of each grasp is computationally demanding, prior work has turned to cloud computing. To improve computational efficiency and to extend this work, we consider how MultiArmed Bandit (MAB) models for optimizing decisions can be applied in this context. We formulate robust grasp planning as a MAB problem and evaluate convergence times towards an optimal grasp candidate using 100 object shapes from the Brown Vision 2D Lab Dataset with 1000 grasp candidates per object. We consider the case where shape uncertainty is represented as a Gaussian process implicit surface (GPIS) with Gaussian uncertainty in pose, gripper approach angle, and coefficient of friction. We find that Thompson Sampling and the Gittins index MAB methods converged to within 3% of the optimal grasp up to 10x faster than uniform allocation and 5x faster than iterative pruning.
In this work, we present a sampling-based approach to trajectory classification which enables automated high-level reasoning about topological classes of trajectories. Our approach is applicable to general configuration spaces and relies only on the availability of collision free samples. Unlike previous sampling-based approaches in robotics which use graphs to capture information about the pathconnectedness of a configuration space, we construct a multiscale approximation of neighborhoods of the collision free configurations based on filtrations of simplicial complexes. Our approach thereby extracts additional homological information which is essential for a topological trajectory classification. We propose a multiscale classification algorithm for trajectories in configuration spaces of arbitrary dimension and for sets of trajectories starting and ending in two fixed points. Using a cone construction, we then generalize this approach to classify sets of trajectories even when trajectory start and end points are allowed to vary in path-connected subsets. We furthermore show how an augmented filtration of simplicial complexes based on an arbitrary function on the configuration space, such as a costmap, can be defined to incorporate additional constraints. We present an evaluation of our approach in 2, 3, 4 and 6 dimensional configuration spaces in simulation and in real-world experiments using a Baxter robot and motion capture data 1 .
Reinforcement Learning (RL) struggles in problems with delayed rewards, and one approach is to segment the task into sub-tasks with incremental rewards. We propose a framework called Hierarchical Inverse Reinforcement Learning (HIRL), which is a model for learning sub-task structure from demonstrations. HIRL decomposes the task into sub-tasks based on transitions that are consistent across demonstrations. These transitions are defined as changes in local linearity w.r.t to a kernel function [21]. Then, HIRL uses the inferred structure to learn reward functions local to the sub-tasks but also handle any global dependencies such as sequentiality.We have evaluated HIRL on several standard RL benchmarks: Parallel Parking with noisy dynamics, Two-Link Pendulum, 2D Noisy Motion Planning, and a Pinball environment. In the parallel parking task, we find that rewards constructed with HIRL converge to a policy with an 80% success rate in 32% fewer time-steps than those constructed with Maximum Entropy Inverse RL (MaxEnt IRL), and with partial state observation, the policies learned with IRL fail to achieve this accuracy while HIRL still converges. We further find that that the rewards learned with HIRL are robust to environment noise where they can tolerate 1 stdev. of random perturbation in the poses in the environment obstacles while maintaining roughly the same convergence rate. We find that HIRL rewards can converge up-to 6× faster than rewards constructed with IRL.
For applications such as Amazon warehouse order fulfillment, robots must grasp a desired object amid clutter: other objects that block direct access. This can be difficult to program explicitly due to uncertainty in friction and push mechanics and the variety of objects that can be encountered. Deep Learning networks combined with Online Learning from Demonstration (LfD) algorithms such as DAgger and SHIV have potential to learn robot control policies for such tasks where the input is a camera image and system dynamics and the cost function are unknown. To explore this idea, we introduce a version of the grasping in clutter problem where a yellow cylinder must be grasped by a planar robot arm amid extruded objects in a variety of shapes and positions. To reduce the burden on human experts to provide demonstrations, we propose using a hierarchy of three levels of supervisors: a fast motion planner that ignores obstacles, crowd-sourced human workers who provide appropriate robot control values remotely via online videos, and a local human expert. Physical experiments suggest that with 160 expert demonstrations, using the hierarchy of supervisors can increase the probability of a successful grasp (reliability) from 55% to 90%.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.