2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr52688.2022.00611

RGB-Depth Fusion GAN for Indoor Depth Completion

Abstract: The raw depth image captured by indoor depth sensors usually has an extensive range of missing depth values due to inherent limitations such as the inability to perceive transparent objects and the limited distance range. The incomplete depth map with missing values burdens many downstream vision tasks, and a rising number of depth completion methods have been proposed to alleviate this issue. While most existing methods can generate accurate dense depth maps from sparse and uniformly sampled depth maps, they …
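As a concrete illustration of the problem setup the abstract describes, the sketch below assumes the common convention that missing sensor readings are stored as zeros, builds a validity mask, and stacks RGB, normalised depth, and the mask into a single network input. The function name build_completion_input and the normalisation are illustrative assumptions, not part of the paper.

import numpy as np

def build_completion_input(rgb: np.ndarray, depth_raw: np.ndarray) -> np.ndarray:
    """Stack RGB, normalised raw depth, and a validity mask into one input tensor.

    rgb:       (H, W, 3) float array in [0, 1]
    depth_raw: (H, W) float array in metres; 0 marks missing sensor readings
    """
    valid_mask = (depth_raw > 0).astype(np.float32)              # 1 where the sensor returned a value
    depth_norm = depth_raw / max(float(depth_raw.max()), 1e-6)   # crude per-image normalisation
    # Channels-last layout: 3 RGB channels + normalised depth + validity mask = 5 channels.
    return np.concatenate(
        [rgb.astype(np.float32), depth_norm[..., None], valid_mask[..., None]], axis=-1
    )

if __name__ == "__main__":
    rgb = np.random.rand(480, 640, 3).astype(np.float32)
    depth = (np.random.rand(480, 640) * 5.0).astype(np.float32)
    depth[100:200, 300:400] = 0.0                                # simulate a hole, e.g. a glass surface
    print(build_completion_input(rgb, depth).shape)              # (480, 640, 5)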

Cited by 23 publications (6 citation statements)
References 68 publications (134 reference statements)

“…To further investigate the effect of the proposed cross-modal attention module, we conducted a comparative analysis with other attention modules that have demonstrated success in various tasks. These attention modules, including Non-local [54] (2018), Criss-cross [55] (2019), Dual-attn [56] (2019), Point attention [57] (2020), W-AdaIN [58] (2022), and EMAF [59] (2023), were integrated into our framework. For a fair comparison, we removed our cross-modal attention … (Table 3: Object pose estimation results (AUC) with different attention modules, evaluated on the pallet RGBD dataset.)”
Section: Results
confidence: 99%
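For readers unfamiliar with the cross-modal attention being compared in the excerpt above, the following is a minimal, generic sketch assuming the usual query/key-value formulation, with RGB features querying depth features. It is an illustrative stand-in built on PyTorch's nn.MultiheadAttention, not the citing paper's proposed module or any of the listed baselines.

import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    """Generic cross-modal attention: tokens from one modality attend over the other."""

    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, feat_rgb: torch.Tensor, feat_depth: torch.Tensor) -> torch.Tensor:
        """feat_rgb, feat_depth: (B, C, H, W) feature maps of the same shape."""
        b, c, h, w = feat_rgb.shape
        q = feat_rgb.flatten(2).transpose(1, 2)     # (B, H*W, C) queries from RGB
        kv = feat_depth.flatten(2).transpose(1, 2)  # (B, H*W, C) keys/values from depth
        out, _ = self.attn(q, kv, kv)               # RGB tokens attend over depth tokens
        out = self.norm(out + q)                    # residual + norm, as in transformer blocks
        return out.transpose(1, 2).reshape(b, c, h, w)

if __name__ == "__main__":
    m = CrossModalAttention(channels=64)
    rgb = torch.randn(2, 64, 16, 16)
    depth = torch.randn(2, 64, 16, 16)
    print(m(rgb, depth).shape)  # torch.Size([2, 64, 16, 16])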
“…Images were obtained from two popular datasets, namely the BEPM and OSU datasets. The fused results were obtained using CVT [35], DTCWT [36], LP [37], RP [38], MSVD [39], FusionGAN [40], and our proposed IFGAN. Based on the observations made in Figure 16, it becomes evident that our approach excels at preserving comprehensive details and distinctive characteristics derived from the viewable picture.…”
Section: Performance Comparison of IFGAN with Other Image Fusion Models
confidence: 99%
“…The authors of [23] propose a style transformation method for generating complete depth maps. The authors of [24] design a network with two branches: the first branch uses an encoding-decoding approach to convert sparse or incomplete depth maps into complete depth maps, while the other branch uses a GAN network to perform depth map style transformation on RGB images to generate depth maps for restoration. The authors of [25] use domain adaptation methods to design and train networks, generating geometric information or noise on synthetic datasets to mimic real datasets.…”
Section: Related Work
confidence: 99%
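The two-branch design described in the excerpt above (reference [24]) pairs an encoder-decoder that completes the raw depth map with a GAN generator that translates the RGB image into a depth-style map. The sketch below is only a schematic illustration of that idea, assuming a simple convolutional encoder-decoder and a concatenation-based fusion head; the names and layer choices (TwoBranchDepthCompletion, channel widths, the omitted adversarial loss) are placeholders, not the cited authors' configuration.

import torch
import torch.nn as nn

def conv_block(cin, cout, stride=1):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, stride, 1), nn.BatchNorm2d(cout), nn.ReLU(inplace=True)
    )

class TwoBranchDepthCompletion(nn.Module):
    def __init__(self):
        super().__init__()
        # Branch 1: encoder-decoder over the raw depth map (+ validity mask).
        self.depth_enc = nn.Sequential(conv_block(2, 32, 2), conv_block(32, 64, 2))
        self.depth_dec = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False), conv_block(64, 32),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False), conv_block(32, 16),
        )
        # Branch 2: RGB-to-depth "style transfer" generator (the GAN generator;
        # the adversarial discriminator and loss are omitted from this sketch).
        self.rgb_gen = nn.Sequential(conv_block(3, 32), conv_block(32, 32), nn.Conv2d(32, 16, 3, 1, 1))
        # Fusion head producing the final dense depth map.
        self.fuse = nn.Sequential(conv_block(32, 16), nn.Conv2d(16, 1, 3, 1, 1))

    def forward(self, rgb, depth_raw, valid_mask):
        d = self.depth_dec(self.depth_enc(torch.cat([depth_raw, valid_mask], dim=1)))
        g = self.rgb_gen(rgb)
        return self.fuse(torch.cat([d, g], dim=1))

if __name__ == "__main__":
    net = TwoBranchDepthCompletion()
    rgb = torch.randn(1, 3, 64, 64)
    depth = torch.randn(1, 1, 64, 64)
    mask = (depth > 0).float()
    print(net(rgb, depth, mask).shape)  # torch.Size([1, 1, 64, 64])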