2017 IEEE International Conference on Robotics and Automation (ICRA)
DOI: 10.1109/icra.2017.7989165
Multi-view self-supervised deep learning for 6D pose estimation in the Amazon Picking Challenge

Abstract: Robot warehouse automation has attracted significant interest in recent years, perhaps most visibly in the Amazon Picking Challenge (APC) [1]. A fully autonomous warehouse pick-and-place system requires robust vision that reliably recognizes and locates objects amid cluttered environments, self-occlusions, sensor noise, and a large variety of objects. In this paper we present an approach that leverages multi-view RGB-D data and self-supervised, data-driven learning to overcome those difficulties. The approach w…

Cited by 436 publications (296 citation statements)
References 22 publications
“…Conversely, in the third example (Figure 11, third column), the left hand and object are correctly predicted to be separated. We also found that hands and objects can be detected in isolation, which reaffirms similar observations made in previous works [7,28].…”
Section: Detecting Unknown Objects (supporting)
confidence: 92%
“…These are then used in a complex multi-stage classification scheme for offline action recognition, while our approach uses the FCN outputs to discriminate the hand from the object and to generate pixel labels for real-time tracking. Similar to our approach, [28] used FCNs to discriminate between objects for 6D pose estimation. While this approach produces good results for object localization in cluttered environments, it requires a multicamera setup and does not achieve real-time performance, which is crucial for dynamic hand-object interactions.…”
Section: Related Work (mentioning)
confidence: 99%
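The statement above refers to fully convolutional networks (FCNs) that produce per-pixel class labels, which are then used to separate hand and object or, in the cited paper, to segment objects before 6D pose estimation. The toy model below is a minimal sketch of that idea only; the layer sizes and the 3-class setup (background / hand / object) are assumptions for illustration, not the architecture of the cited works.

```python
import torch
import torch.nn as nn

# A toy fully convolutional network: every layer is convolutional, so the
# output keeps the spatial resolution and yields one class score per pixel.
# Channel counts and the 3 classes are illustrative assumptions.
fcn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 3, kernel_size=1),            # per-pixel scores for 3 classes
)

rgb = torch.rand(1, 3, 120, 160)                # dummy RGB frame, (N, C, H, W)
scores = fcn(rgb)                               # (1, 3, 120, 160) class scores
pixel_labels = scores.argmax(dim=1)             # (1, 120, 160) label map
print(pixel_labels.shape)
```

The per-pixel label map is what downstream steps consume, whether for real-time hand tracking or for cropping object segments prior to pose fitting.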
“…Taking an object pose estimation task [10] as an example, the input data includes a set of images and the total size of it could be hundreds of kilobytes at least, while the output data is just the object location and pose, and takes a few dozens of bytes at most.…”
Section: Local Computing (mentioning)
confidence: 99%
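The size asymmetry this citation points out (large image input, tiny pose output) can be made concrete with a back-of-the-envelope calculation. The frame format and byte counts below are assumptions chosen for illustration, not figures taken from the cited work.

```python
import struct

# Assumed input: one raw VGA RGB-D view (8-bit RGB + 16-bit depth per pixel).
# Even a compressed view would still be hundreds of kilobytes.
rgbd_frame_bytes = 640 * 480 * (3 + 2)

# Assumed output: a 6D pose as 3 float32 translation values + a 4-float quaternion.
pose_bytes = len(struct.pack("<7f", *([0.0] * 7)))  # 7 x 4 bytes = 28 bytes

print(f"input per view : {rgbd_frame_bytes / 1024:.0f} KiB")
print(f"output pose    : {pose_bytes} bytes")
```

Under these assumptions the output is roughly five orders of magnitude smaller than a single input view, which is why offloading the computation and returning only the pose is attractive.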
“…In addition to these works on 2D segmentation, three-dimentional segmentation is required for robot to conduct tasks in the real world. In order to achieve this, previous works propose projection-based approach projecting segmented pixels to 3D points in a single view (2.5D) [9], mapping-based approach with binary object existence [12] and probabilistic existence [1] for a single target object. And as for fully 3D-based approach, model matching is tackled [13] [14] using various 3D features [15] [16].…”
Section: 3D Multilabel Mapping for Object Segmentation and Manipulation (mentioning)
confidence: 99%
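The projection-based (2.5D) approach mentioned in this last citation amounts to back-projecting segmented pixels into 3D points from a single depth view using the pinhole camera model. The sketch below illustrates that step only; the function name and parameters are hypothetical, not taken from the cited works.

```python
import numpy as np

def backproject_segment(depth, mask, fx, fy, cx, cy):
    """Back-project segmented pixels of one depth view into camera-frame 3D points.

    depth : (H, W) float array of depth in meters.
    mask  : (H, W) boolean segmentation of the target object.
    fx, fy, cx, cy : pinhole intrinsics (focal lengths and principal point).
    """
    v, u = np.nonzero(mask)              # pixel rows/cols belonging to the segment
    z = depth[v, u]
    valid = z > 0                        # drop pixels with missing depth
    u, v, z = u[valid], v[valid], z[valid]
    x = (u - cx) * z / fx                # pinhole back-projection
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)   # (N, 3) partial point cloud of the object
```

Because only one viewpoint contributes, the result is a partial (2.5D) point cloud; the mapping-based and fully 3D model-matching approaches cited above aim to go beyond this single-view limitation.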