2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2019.01004

SIGNet: Semantic Instance Aided Unsupervised 3D Geometry Perception

Abstract: Unsupervised learning for geometric perception (depth, optical flow, etc.) is of great interest to autonomous systems. Recent works on unsupervised learning have made considerable progress on perceiving geometry; however, they usually ignore the coherence of objects and perform poorly in dark and noisy environments. In contrast, supervised learning algorithms, which are robust, require large labeled geometric datasets. This paper introduces SIGNet, a novel framework that provides robust geomet…

Cited by 60 publications (29 citation statements)
References 51 publications

Citation statements (ordered by relevance):
“…Depth estimation as used in this work has been shown to both benefit from and contribute to MTL with other tasks such as semantic segmentation [44], [47], [48], domain adaptation [7], optical flow estimation [49], [50], or 3D pose estimation [4], [24]. In particular, self-supervised depth estimation has been combined with semantic segmentation [46], [51]-[53] or instance segmentation [36], [54] to mitigate the effect of moving objects, which violate the static-world assumption made during the training of such models. Consistency checks between both tasks [45] and unidirectional feature representation improvements [51] have also proven successful.…”
Section: B. Multi-task Learning
confidence: 99%
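The mitigation strategy these works share can be made concrete with a toy example. Below is a minimal NumPy sketch (all names are illustrative, not taken from any cited paper) of a photometric view-synthesis loss that uses a segmentation-derived mask to exclude pixels on potentially moving objects, since those pixels violate the static-world assumption:

```python
import numpy as np

def masked_photometric_loss(target, warped, moving_mask):
    """L1 photometric loss that ignores pixels on moving objects.

    target      -- (H, W, 3) reference frame
    warped      -- (H, W, 3) source frame warped into the reference view
                   using the predicted depth and camera pose
    moving_mask -- (H, W) boolean array, True where a semantic or
                   instance network flags a potentially moving object
    """
    # Per-pixel L1 residual, averaged over color channels.
    residual = np.abs(target.astype(np.float64)
                      - warped.astype(np.float64)).mean(axis=-1)
    static = ~moving_mask
    if not static.any():  # degenerate frame: everything flagged as moving
        return 0.0
    # Average over static pixels only, so moving objects cannot
    # corrupt the view-synthesis supervision signal.
    return float(residual[static].mean())
```

In an actual training pipeline the warped image comes from differentiably resampling the source frame with the predicted depth and pose; the masking idea is the same.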
“…While this approach can considerably improve depth estimation performance, it incurs significantly more computation. Other works add new loss terms during training, either via multi-task training [35] or by enforcing segmentation consistency between the warped and real images [3,24,39]. These methods require no extra semantic computation at test time, but they require running a semantic network at every training iteration, which still generates considerable overhead.…”
Section: Related Work
confidence: 99%
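As a rough illustration of such a segmentation-consistency term (a sketch under assumed inputs, not the formulation of any specific cited paper): if the predicted depth and pose are accurate and the scene is static, the label map of the source frame warped into the target view should agree with the label map of the real target frame. The hypothetical helper below measures that disagreement on hard labels; real training losses use soft class probabilities and a differentiable warp so gradients can flow.

```python
import numpy as np

def segmentation_consistency(seg_target, seg_warped, ignore_label=-1):
    """Fraction of pixels whose class labels disagree between the real
    target-frame segmentation and the source-frame segmentation warped
    into the target view (0.0 = perfectly consistent).

    seg_target, seg_warped -- (H, W) integer label maps
    ignore_label           -- value marking pixels warped out of view
    """
    valid = seg_warped != ignore_label
    if not valid.any():
        return 0.0
    return float((seg_target[valid] != seg_warped[valid]).mean())
```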
“…Although vision-based direct methods [50], [51], which work with all the raw pixel information in images, perform better in textureless scenes, they require high computing power (GPUs) to achieve real-time processing, which is unavailable on payload-limited MAVs. In addition, deep-learning-based approaches [52], [53] that learn the mapping between the state and the images are insensitive to lighting conditions and texture. However, they either require a labor-intensive site survey to label data for supervised learning, or suffer from inferior performance due to the risk of overfitting.…”
Section: Related Work
confidence: 99%
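The compute cost mentioned above follows directly from how direct methods score a candidate camera pose: every raw pixel contributes to the alignment objective. A minimal sketch (illustrative names, grayscale images assumed):

```python
import numpy as np

def photometric_error(ref_img, warped_img):
    """Mean squared intensity residual between a reference image and the
    current image warped by a candidate camera pose. Direct methods
    evaluate this over all raw pixels, typically at several pyramid
    levels and for many pose updates per frame, which helps in
    low-texture scenes but demands substantial compute for real time.
    """
    r = ref_img.astype(np.float64) - warped_img.astype(np.float64)
    return float(np.mean(r * r))
```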