2019
DOI: 10.48550/arxiv.1912.09678
Preprint

IRS: A Large Naturalistic Indoor Robotics Stereo Dataset to Train Deep Models for Disparity and Surface Normal Estimation

Abstract: Indoor robotics localization, navigation and interaction heavily rely on scene understanding and reconstruction. Compared to monocular vision which usually does not explicitly introduce any geometrical constraint, stereo vision based schemes are more promising and robust to produce accurate geometrical information, such as surface normal and depth/disparity. Besides, deep learning models trained with large-scale datasets have shown their superior performance in many stereo vision tasks. However, existing stere…

Citations: cited by 6 publications (8 citation statements)
References: 40 publications
“…We learn a monocular depth prediction network using a scale- and shift-invariant trimmed loss that operates on an inverse depth representation, together with the gradient-matching loss proposed in [22]. We construct a meta-dataset that includes the original datasets that were used in [30] (referred to as MIX 5 in that work) and extend it with five additional datasets ([18,43,44,46,47]). 1.…”
Section: Monocular Depth Estimation
Citation type: mentioning
confidence: 99%
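The statement above names a scale- and shift-invariant trimmed loss on inverse depth but does not define it. The following is only an illustrative sketch of one common way to build such a loss, not the citing paper's implementation. It rests on several assumptions: shift is removed with the median, scale with the mean absolute deviation, the largest 20% of residuals are discarded, and the remaining residuals are averaged. The function names `normalize` and `ssi_trimmed_loss` and the `trim` parameter are hypothetical.

```python
from statistics import median


def normalize(values):
    """Remove shift (median) and scale (mean absolute deviation). Assumed scheme."""
    shift = median(values)
    scale = sum(abs(v - shift) for v in values) / len(values)
    if scale == 0:
        raise ValueError("constant input has no defined scale")
    return [(v - shift) / scale for v in values]


def ssi_trimmed_loss(pred_inv_depth, true_inv_depth, trim=0.2):
    """Mean absolute error over the smallest (1 - trim) fraction of residuals,
    computed after normalizing both inputs for shift and scale."""
    if len(pred_inv_depth) != len(true_inv_depth):
        raise ValueError("inputs must have the same length")
    pred = normalize(pred_inv_depth)
    true = normalize(true_inv_depth)
    residuals = sorted(abs(p - t) for p, t in zip(pred, true))
    keep = max(1, int(len(residuals) * (1.0 - trim)))  # drop the largest residuals
    return sum(residuals[:keep]) / keep


# A prediction that differs from the target only by an affine map
# (here 2 * x + 3) should incur essentially zero loss.
loss = ssi_trimmed_loss([1, 2, 3, 4, 5], [5, 7, 9, 11, 13])
```

Normalizing both signals before comparing them is what makes the loss indifferent to unknown scale and shift, which matters when mixing datasets whose depth annotations use different conventions. Trimming limits the influence of a few badly wrong labels.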
“…Simulation software such as UE and Blender are the most widely adopted tools, which allow researchers to build their own scenes, with changeable textures, lighting and weather conditions [26]. Representative datasets constructed using this method include Sintel [21], Scene Flow [20], HR-VS [27], IRS [14] and New Tsukuba CG [28]. In general, disparity images generated by simulation software have the advantages of high accuracy, high density (usually no invalid pixels), and convenient to build large scale datasets.…”
Section: A Stereo Matching Dataset
Citation type: mentioning
confidence: 99%
“…In terms of the accuracy of the disparity map, only three datasets: Middlebury2014 [7], HR-VS [27] and IRS [14] achieved sub-pixel accuracy, and the latter two [14], [27] were synthetic datasets generated by software. Obviously, as the average matching error of the SOTA deep learning stereo matching model has been less than one pixel, most datasets that only provide pixel-level accuracy disparity maps can no longer meet the requirements of the deep learning models.…”
Section: B Dataset Comparison
Citation type: mentioning
confidence: 99%
“…Additionally, we conduct another interesting evaluation. We use IRS dataset [29], a large synthetic stereo dataset, and Flyingthings3D together to train BGNet and BGNet+. IRS contains more than 100,000 pairs of 960 × 540 resolution stereo images (84,946 training and 15,079 testing) in indoor scenes.…”
Section: Generalization Performance
Citation type: mentioning
confidence: 99%