2019
DOI: 10.1177/0278364919868017
Robotic pick-and-place of novel objects in clutter with multi-affordance grasping and cross-domain image matching

Abstract: This paper presents a robotic pick-and-place system that is capable of grasping and recognizing both known and novel objects in cluttered environments. The key new feature of the system is that it handles a wide range of object categories without needing any task-specific training data for novel objects. To achieve this, it first uses an object-agnostic grasping framework to map from visual observations to actions: inferring dense pixel-wise probability maps of the affordances for four different grasping primi…
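The abstract describes inferring dense pixel-wise affordance probability maps, one per grasping primitive, and acting on them. A minimal numpy sketch of the final action-selection step, assuming four such maps are already predicted (the function name and map layout are hypothetical, not from the paper):

```python
import numpy as np

def best_grasp(affordance_maps):
    """Pick the highest-scoring (primitive, pixel) action.

    affordance_maps: array of shape (4, H, W) -- one map per grasping
    primitive, each entry a predicted grasp-success probability at that
    pixel (a stand-in for the paper's dense network outputs).
    """
    flat_idx = np.argmax(affordance_maps)
    primitive, row, col = np.unravel_index(flat_idx, affordance_maps.shape)
    score = affordance_maps[primitive, row, col]
    return int(primitive), (int(row), int(col)), float(score)

# Toy example: four 3x4 maps with a single clear maximum.
maps = np.zeros((4, 3, 4))
maps[2, 1, 3] = 0.9
print(best_grasp(maps))  # (2, (1, 3), 0.9)
```

The robot would then execute primitive 2 centered at the returned pixel; in practice the argmax is taken over the maps predicted from the current RGB-D observation.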

Cited by 150 publications
(168 citation statements)
References 50 publications
(111 reference statements)
“…Also, the availability of affordable RGB-D sensors has enabled the use of deep learning techniques to learn object features directly from image data. Recent experiments with convolutional neural networks [2], [17], [18] have demonstrated that they can efficiently compute stable grasps. Pinto et al. [3] used an architecture similar to AlexNet to show that, by increasing the size of the training data, their CNN generalized better to new data. Varley et al. [19] propose an interesting approach to grasp planning through shape completion, in which a 3D CNN was trained on 3D prototypes of objects from their own dataset, captured from various viewpoints.…”
Section: Related Work
confidence: 99%
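The shape-completion approach cited above feeds a 3D CNN with a voxel representation of the observed object. A minimal numpy sketch of that preprocessing step, converting a point cloud into a binary occupancy grid (the function name and grid size are assumptions for illustration, not the cited authors' code):

```python
import numpy as np

def voxelize(points, grid_size=32):
    """Convert an (N, 3) point cloud into a binary occupancy grid.

    Points are scaled into the grid's index range and each occupied
    cell is set to 1 -- the kind of input a 3D CNN for shape
    completion would consume.
    """
    pts = np.asarray(points, dtype=float)
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)          # avoid divide-by-zero
    idx = ((pts - lo) / span * (grid_size - 1)).astype(int)
    grid = np.zeros((grid_size,) * 3, dtype=np.uint8)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1
    return grid

cloud = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.5, 0.5, 0.5]])
g = voxelize(cloud, grid_size=4)
print(g.sum())  # 3 occupied voxels
```

The network would then predict the full (completed) occupancy grid from this partial-view input.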
“…For semantic occlusion segmentation, we extend previous work on semantic (visible) segmentation with fully convolutional networks (FCNs) proposed in [1]. All layers are convolutional or pooling layers, which preserve image geometry, so FCNs are effective and widely used for pixel-wise score regression tasks: depth prediction [24,25], grasp affordance [12,26,27], optical flow [28], and instance masks [4,6].…”
Section: B. Semantic Occlusion Segmentation
confidence: 99%
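The geometry-preserving property mentioned above is what lets an FCN emit one score per pixel. A toy numpy illustration of a single "same"-padded convolution whose output keeps the input's spatial size (illustrative only; a real FCN stacks many learned layers):

```python
import numpy as np

def conv2d_same(image, kernel):
    """Naive 2D cross-correlation with zero padding so the output has
    the same spatial size as the input -- the property that makes
    fully convolutional networks suitable for pixel-wise regression."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

img = np.arange(16.0).reshape(4, 4)
scores = conv2d_same(img, np.ones((3, 3)) / 9.0)
print(scores.shape)  # (4, 4) -- same spatial size as the input
```

Because spatial size is preserved layer after layer, the final feature map can be read directly as a dense score (e.g. grasp affordance) at every pixel.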
“…One prominent field is robotic manipulation, in which industrial parts are grasped and placed at a desired location [62], [56], [57], [60], [91], [202], [200]. The Amazon Picking Challenge (APC) [1] is an important example demonstrating how object detection and 6D pose estimation, when performed successfully, improve the autonomy of manipulation, i.e. the automated handling of parts by robots [55], [58], [59]. Household robotics is another field where the ability to recognize objects and accurately estimate their poses is a key element [97], [98], [99].…”
Section: Introduction
confidence: 99%