“…Recent works have shown promising results on grasping transparent objects by completing the missing depth values and then applying a geometry-based grasp engine [1], [2], [3], by transfer learning from RGB-based grasping neural networks [4], by light-field feature learning [5], or by domain-randomized depth noise simulation [6]. For more advanced manipulation tasks such as rigid-body pick-and-place or liquid pouring, geometric estimates, such as symmetry axes, edges [7], or object poses [8], [9], [6], are required to model the manipulation trajectories. Instance-level transparent object poses can be estimated from keypoints on stereo RGB images [10], [11], from a light-field camera [12], [13], or directly from a single RGB-D image [9] under support-plane assumptions.…”