Our method consists of three modules: multi-view preprocessing, an adaptive encoder-decoder network, and postprocessing with shared occlusion masks, as shown in Figure 2. Inspired by existing learning-based multi-view methods such as [1,3], which use fronto-parallel planes at different depths as hypothesis planes, the first step of our preprocessing is to warp the source images via homographies and then construct cost volumes, as shown in Figure 2. The input sequences are given as image-pose pairs, each consisting of a reference image $I_r$ for depth estimation, $N$ additional source images $\{I_i\}_{i=1}^{N}$, and the corresponding camera parameters.
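The plane-sweep step above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the standard plane-induced homography $H_i(d) = K_i (R + t\,n^\top/d) K_r^{-1}$ for a fronto-parallel plane normal $n = (0,0,1)^\top$ in the reference frame, uses nearest-neighbour sampling on single-channel images, and aggregates the warped views into a variance-based cost volume as in MVSNet-style methods; all function names are illustrative.

```python
import numpy as np

def fronto_parallel_homography(K_ref, K_src, R, t, depth):
    """Plane-induced homography H(d) = K_src (R + t n^T / d) K_ref^{-1}
    mapping reference pixels to source pixels for a fronto-parallel
    hypothesis plane at distance `depth` from the reference camera."""
    n = np.array([0.0, 0.0, 1.0])  # plane normal in the reference frame
    return K_src @ (R + np.outer(t, n) / depth) @ np.linalg.inv(K_ref)

def warp_to_reference(src, H):
    """Warp a source image into the reference view by inverse mapping:
    each reference pixel x samples src at H @ x (nearest neighbour)."""
    h, w = src.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])  # (u, v, 1)
    q = H @ pix
    u = np.round(q[0] / q[2]).astype(int)
    v = np.round(q[1] / q[2]).astype(int)
    valid = (u >= 0) & (u < w) & (v >= 0) & (v < h)  # pixels seen by the source view
    out = np.zeros(h * w)
    flat = src.ravel()
    out[valid] = flat[v[valid] * w + u[valid]]
    return out.reshape(h, w)

def variance_cost_volume(ref, srcs, homographies_per_depth):
    """MVSNet-style cost volume: per-depth variance across the reference
    image and all homography-warped source images, shape (D, H, W)."""
    volume = []
    for Hs in homographies_per_depth:  # one homography per source view
        views = [ref] + [warp_to_reference(s, H) for s, H in zip(srcs, Hs)]
        volume.append(np.var(np.stack(views), axis=0))
    return np.stack(volume)
```

With identical cameras (same intrinsics, identity rotation, zero translation) the homography collapses to the identity, so a source identical to the reference yields zero cost at every depth; a low-variance slice similarly marks the correct depth hypothesis in the general case.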