2019 International Conference on Robotics and Automation (ICRA)
DOI: 10.1109/icra.2019.8794233
Multi-Modal Geometric Learning for Grasping and Manipulation

Abstract: This work provides an architecture that incorporates depth and tactile information to create rich and accurate 3D models useful for robotic manipulation tasks. This is accomplished through the use of a 3D convolutional neural network (CNN). Offline, the network is provided with both depth and tactile information and trained to predict the object's geometry, thus filling in regions of occlusion. At runtime, the network is provided a partial view of an object. Tactile information is acquired to augment the captu…
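The abstract describes fusing a depth-derived partial view with tactile contacts into a single input for the shape-completion CNN. The sketch below is a hedged numpy illustration of that multi-modal input fusion only; the grid size, function name, and binary occupancy encoding are assumptions for illustration, not the paper's actual representation.

```python
import numpy as np

def fuse_depth_and_tactile(depth_voxels, tactile_points, grid_size=40):
    """Build a multi-modal occupancy grid for a shape-completion CNN.

    depth_voxels   : (N, 3) integer voxel indices observed as occupied
                     by the depth camera (the partial view).
    tactile_points : (M, 3) integer voxel indices where the tactile
                     sensor confirmed contact (often in occluded regions).
    Returns a (grid_size,)**3 float grid: 1.0 where either modality
    reports surface, 0.0 elsewhere.
    """
    grid = np.zeros((grid_size, grid_size, grid_size), dtype=np.float32)
    for pts in (depth_voxels, tactile_points):
        pts = np.asarray(pts, dtype=int)
        if pts.size:
            grid[pts[:, 0], pts[:, 1], pts[:, 2]] = 1.0
    return grid

# Example: three depth voxels on the visible face, one tactile touch
# on the occluded back side of the object.
visible = [(10, 10, 5), (10, 11, 5), (11, 10, 5)]
touch = [(10, 10, 30)]
g = fuse_depth_and_tactile(visible, touch)
print(g.sum())  # 4.0 -> four occupied voxels in the fused input
```

A completion network would consume this grid and output occupancy probabilities for the full volume, including the occluded region the tactile reading disambiguates.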


Cited by 49 publications (54 citation statements). References 36 publications.
“…To date, many grasping methods rely fully or in part on deep learning. Some methods use deep learning only to extract additional information about objects, e.g., via shape completion [9], [10] or tactile information [11], and then use analytical methods to plan the actual grasp [12], while others employ data-driven grasp planning in an end-to-end fashion to generate grasps directly from images [1]–[8]. We will review both shape completion and end-to-end data-driven grasp planning, as both are vital parts of our grasping pipeline.…”
Section: Related Work
Confidence: 99%
“…In the context of shape completion from incomplete point clouds, most recent improvements come from the adoption of deep learning. For instance, different works have explored tailored network structures [9], [13], [14], semantic object classification to aid the reconstruction [15], the integration of other sensing modalities such as tactile information [11], or the exploitation of the network uncertainty [10].…”
Section: A. Deep Shape Completion
Confidence: 99%
“…Approaches to grasp synthesis can be classified into analytic and empirical methods; see Bohg et al. [15] for a survey. Analytic approaches use physics-based contact models to compute force closure on an object, using the shape and estimated pose of the target object [16], [17], [18], but they work poorly in the real world due to noisy sensing, simplified assumptions about contact physics, and the difficulty of placing contact points accurately.…”
Section: B. Grasp Synthesis
Confidence: 99%
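As a concrete instance of the analytic, physics-based contact models mentioned in the excerpt above, here is a minimal two-finger antipodal force-closure test under Coulomb friction: the grasp achieves force closure iff the line joining the two contacts lies inside both friction cones. The function name and the restriction to two frictional point contacts are illustrative simplifications, not the method of any cited work.

```python
import numpy as np

def antipodal_force_closure(p1, n1, p2, n2, mu):
    """Force-closure check for a two-finger grasp with Coulomb friction.

    p1, p2 : contact points, shape (3,)
    n1, n2 : inward-pointing unit surface normals at the contacts
    mu     : friction coefficient (friction cone half-angle = arctan(mu))
    """
    p1, n1, p2, n2 = map(np.asarray, (p1, n1, p2, n2))
    cone_half_angle = np.arctan(mu)
    d = p2 - p1
    d = d / np.linalg.norm(d)  # unit vector from contact 1 to contact 2
    # Angle between each inward normal and the connecting line.
    ang1 = np.arccos(np.clip(np.dot(n1, d), -1.0, 1.0))
    ang2 = np.arccos(np.clip(np.dot(n2, -d), -1.0, 1.0))
    return bool(ang1 <= cone_half_angle and ang2 <= cone_half_angle)

# Parallel-jaw grasp on a box: normals point at each other -> force closure.
print(antipodal_force_closure([0, 0, 0], [1, 0, 0],
                              [1, 0, 0], [-1, 0, 0], mu=0.5))  # True
```

The fragility the excerpt notes is visible here: a small error in the estimated normals or contact positions can push an angle past the cone half-angle and flip the verdict, which is why noisy real-world sensing undermines purely analytic methods.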
“…Despite the fact that they successfully added context information to the voxel representations, their approach still required further improvements, since the shape details of small 3D objects were missed. Later, Varley et al. [3] presented a CNN approach for 3D shape reconstruction as part of a robot grasp-planning algorithm from a single depth view [4]. This approach combined 3D convolutional layers with fully connected layers to infer the complete 3D shape.…”
Section: Introduction
Confidence: 99%
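The excerpt describes the Varley et al. architecture only at the level of "3D convolutional layers followed by fully connected layers." The toy numpy sketch below shows that pattern in miniature: voxel grid in, conv features, dense layer, per-voxel occupancy probabilities out. All sizes, weights, and function names are hypothetical; the real network is far larger and trained, not random.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv3d(vol, kernel):
    """Valid-mode single-channel 3D convolution, the workhorse
    of voxel-based shape-completion networks."""
    k = kernel.shape[0]
    out = np.empty(tuple(s - k + 1 for s in vol.shape))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for l in range(out.shape[2]):
                out[i, j, l] = np.sum(vol[i:i+k, j:j+k, l:l+k] * kernel)
    return out

def tiny_completion_net(partial_grid, w_conv, w_fc):
    """Sketch of the conv-then-dense pipeline: 3D conv features
    -> fully connected layer -> per-voxel occupancy probabilities."""
    feat = np.maximum(conv3d(partial_grid, w_conv), 0.0)  # conv + ReLU
    logits = feat.reshape(-1) @ w_fc                      # dense layer
    return 1.0 / (1.0 + np.exp(-logits))                  # sigmoid occupancy

grid = (rng.random((8, 8, 8)) > 0.7).astype(np.float32)    # toy partial view
w_conv = rng.standard_normal((3, 3, 3))
w_fc = rng.standard_normal((6 * 6 * 6, 8 * 8 * 8)) * 0.01  # predict full 8^3 grid
probs = tiny_completion_net(grid, w_conv, w_fc)
print(probs.shape)  # (512,) -> one occupancy probability per output voxel
```

Thresholding the output probabilities yields a completed voxel model, which a downstream planner can mesh and use for grasp synthesis.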