2019 International Conference on Robotics and Automation (ICRA)
DOI: 10.1109/icra.2019.8793744

Segmenting Unknown 3D Objects from Real Depth Images using Mask R-CNN Trained on Synthetic Data

Abstract: The ability to segment unknown objects in depth images has potential to enhance robot skills in grasping and object tracking. Recent computer vision research has demonstrated that Mask R-CNN can be trained to segment specific categories of objects in RGB images when massive hand-labeled datasets are available. As generating these datasets is time-consuming, we instead train with synthetic depth images. Many robots now use depth sensors, and recent results suggest training on synthetic depth data can transfer successfully…
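
Since the paper feeds single-channel depth images into a network whose backbone was designed for RGB input, a common preprocessing step is to normalize the depth map and replicate it across three channels. The minimal Python sketch below illustrates that idea; the function name `depth_to_network_input` and the clipping range are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def depth_to_network_input(depth_m, min_depth=0.25, max_depth=1.0):
    """Clip and normalize a metric depth image, then replicate it to three
    channels so it can be fed to a Mask R-CNN backbone that expects
    RGB-shaped input. The clipping range is a placeholder value."""
    depth = np.clip(depth_m, min_depth, max_depth)
    depth = (depth - min_depth) / (max_depth - min_depth)   # scale to [0, 1]
    depth_u8 = (255.0 * depth).astype(np.uint8)
    return np.dstack([depth_u8] * 3)                         # H x W x 3, uint8
```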

Cited by 177 publications (120 citation statements)
References 48 publications (84 reference statements)

“…This technique has been successfully used for object localization [48••], segmentation [74], robot control for pick-and-place [75], swing-peg-in-hole [76], opening a cabinet drawer [76], in-hand manipulation [77], one-handed Rubik's Cube solving [78], precise 6D pose regression in highly cluttered environments [20•], etc. Modifications propose an automatic scheduling of the intensity of the randomization based on the current performance of the system [78] or adapting simulation randomizations by using real-world data to identify distributions that are particularly suited for a successful transfer [76].…”
Section: Domain Randomization (mentioning)
confidence: 99%
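
As context for the domain randomization discussed in the statement above, the sketch below shows the core idea: each synthetic training image is rendered with simulation parameters drawn from broad distributions. All parameter names and ranges here are placeholder assumptions, not values from the cited works.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def sample_scene_parameters():
    """Draw one set of randomized simulation parameters for rendering a
    synthetic training image. Names and ranges are placeholders."""
    return {
        "camera_height_m": rng.uniform(0.5, 1.0),      # distance of the camera above the scene
        "camera_tilt_deg": rng.uniform(-10.0, 10.0),   # perturbation of the viewing angle
        "num_objects": int(rng.integers(1, 11)),       # amount of clutter in the heap
        "depth_noise_std_m": rng.uniform(0.0, 0.005),  # simulated sensor noise level
    }

# Each synthetic image is rendered with freshly sampled parameters, so the
# trained model cannot overfit to a single simulated configuration.
```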
“…For our work, we chose an open-source Python implementation by Matterport [34] built on the Keras and TensorFlow frameworks. The model and its implementation have already gained popularity among researchers [35][36][37][38][39][40] for several reasons: the model is published under the MIT license, which allows users to modify it; it adopts the well-established ResNet CNN backbone [41] and recently introduced concepts such as the Feature Pyramid Network (FPN) [42] and RoI Align, which in terms of quality make Mask R-CNN superior to comparable models like Faster R-CNN; and the maximum accepted input image resolution (1024 × 1024 pixels) is high compared with many previously developed CNN models such as YOLO (up to 608 × 608 pixels) [43] or the Faster R-CNN [42] Python implementation (600 × 1000 pixels). The ability to analyze higher-resolution images is especially important when dealing with small objects like biological cells [35].…”
Section: CNN Quantification Methods (mentioning)
confidence: 99%
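
For readers unfamiliar with the Matterport implementation discussed in the statement above, a minimal inference sketch follows. It uses the repository's public `Config`/`MaskRCNN` interface; the weight file name, class count, and image dimensions are placeholders, and `image` is assumed to be a pre-loaded H × W × 3 uint8 array.

```python
# Inference with the Matterport Mask R-CNN implementation
# (https://github.com/matterport/Mask_RCNN). Weight file, class count,
# and image dimensions below are placeholders.
import mrcnn.model as modellib
from mrcnn.config import Config

class InferenceConfig(Config):
    NAME = "depth_seg"
    NUM_CLASSES = 1 + 1        # background + a single category-agnostic "object" class
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1
    IMAGE_MIN_DIM = 512
    IMAGE_MAX_DIM = 1024       # the implementation accepts inputs up to 1024 x 1024

config = InferenceConfig()
model = modellib.MaskRCNN(mode="inference", config=config, model_dir="./logs")
model.load_weights("mask_rcnn_depth.h5", by_name=True)

# detect() returns bounding boxes, class IDs, scores, and per-instance masks.
results = model.detect([image], verbose=0)[0]
instance_masks = results["masks"]   # H x W x N boolean array
```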
“…When objects lack colour or textural information, this approach fails to produce reliable results. More recently, various Artificial Neural Networks (ANNs) have been used for 3D object recognition and 6D pose estimation [18, 19, 28, 29]. Gupta et al. [18] used both colour images and depth features to train a Convolutional Neural Network (CNN) model.…”
Section: Related Work (mentioning)
confidence: 99%
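
To make the colour-plus-depth idea in the statement above concrete, the sketch below stacks an RGB image with a normalized depth channel into a single CNN input. This is a simplified illustration rather than the specific encoding used in the cited work; published RGB-D approaches often use richer depth encodings (e.g., HHA) instead of a raw depth channel.

```python
import numpy as np

def fuse_rgb_and_depth(rgb_u8, depth_m, max_depth=2.0):
    """Stack an RGB image with a normalized depth channel to form a
    4-channel CNN input. Illustrative only; the depth cutoff is a
    placeholder, and richer depth encodings are common in practice."""
    depth = np.clip(depth_m / max_depth, 0.0, 1.0)
    depth_u8 = (255.0 * depth).astype(np.uint8)[..., None]   # H x W x 1
    return np.concatenate([rgb_u8, depth_u8], axis=-1)       # H x W x 4
```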
“…To improve the runtime performance of shape retrieval approaches, researchers [16] suggested moving the heavy computation to offline stages. With the affordability and accessibility of RGB-D sensors, researchers have proposed various object detection and pose estimation methods that use both optical and depth information [10, 17, 18, 19, 20]. Although these methods usually outperform approaches based on optical information alone, depth sensors have a limited capturing angle and are more sensitive to illumination conditions.…”
Section: Introduction (mentioning)
confidence: 99%