2021
DOI: 10.1609/aaai.v35i4.16411

Self-supervised Multi-view Stereo via Effective Co-Segmentation and Data-Augmentation

Abstract: Recent studies have witnessed that self-supervised methods based on view synthesis obtain clear progress on multi-view stereo (MVS). However, existing methods rely on the assumption that the corresponding points among different views share the same color, which may not always be true in practice. This may lead to unreliable self-supervised signal and harm the final reconstruction performance. To address the issue, we propose a framework integrated with more reliable supervision guided by semantic co-segmentati…

Citations: cited by 50 publications (26 citation statements)
References: 33 publications
“…1, DS-MVSNet achieves the best accuracy, completeness, and overall score among coarse-to-fine supervised methods. Especially for completeness, the metric is improved by 10% compared with U-MVSNet-MS [30]. Fig.…”
Section: Evaluation on DTU Dataset (mentioning)
confidence: 94%
“…However, these methods cannot be trained in an end-to-end manner. JDACS [30] proposed an end-to-end network, supervised by photometric consistency, segmentation map and augmentation data. However, it requires a pretrained feature extraction backbone for segmentation and inferring two times due to data augmentation.…”
Section: Unsupervised Learning-based MVS (mentioning)
confidence: 99%
“…The accuracy and completeness of the network reconstruction point cloud and the generalization ability of the network are better than most 3D reconstruction methods of the same period. Therefore, this method is widely used for depth estimation in most deep learning MVS networks [15, 16, 17, 18, 19, 20]. However, to ensure the accuracy of depth calculation, the storage requirement is three times that of image resolution.…”
Section: Introduction (mentioning)
confidence: 99%
“…In this architecture, multi-scale pyramid feature aggregation is used to construct a 3D cost volume with more context information, and the loss function combines pixel loss and feature loss. In 2021, Xu et al [20] combined data augmentation and semantic segmentation as self-supervised signals, making the reconstruction effect comparable to that of the most advanced supervised learning networks. Yang et al [27] comprehensively used various methods such as deep fusion, mesh generation and deep rendering in unsupervised networks to optimize the pseudo depth.…”
Section: Introduction (mentioning)
confidence: 99%