2021
DOI: 10.1609/aaai.v35i2.16185
Attention-based Multi-Level Fusion Network for Light Field Depth Estimation

Abstract: Depth estimation from Light Field (LF) images is a crucial basis for LF-related applications. Since multiple views with abundant information are available, how to effectively fuse features of these views is a key point for accurate LF depth estimation. In this paper, we propose a novel attention-based multi-level fusion network. Combined with the four-branch structure, we design an intra-branch fusion strategy and an inter-branch fusion strategy to hierarchically fuse effective features from different views. By intr…
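The paper's network is not reproduced here, but the core idea the abstract describes, attention-weighted fusion of features from multiple views, can be sketched in a toy form. This is an illustration only, not the authors' architecture; the `attention_fuse` function and the scaled dot-product weighting against a center-view feature are assumptions:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_fuse(view_features, ref):
    """Fuse per-view features with attention weights.

    view_features: (V, C) array, one C-dim feature vector per angular view.
    ref: (C,) reference feature (e.g. from the center view).
    Returns a single (C,) fused feature: views more similar to the
    reference receive larger weights.
    """
    scores = view_features @ ref / np.sqrt(ref.size)  # scaled dot-product similarity
    weights = softmax(scores)                         # attention weights over views
    return weights @ view_features                    # weighted sum of view features

# Toy usage: 9 angular views (a 3x3 LF grid), 16-dim features each.
rng = np.random.default_rng(0)
feats = rng.normal(size=(9, 16))
center = feats[4]                  # center view as the reference
fused = attention_fuse(feats, center)
```

In this simplified form the weighting favors views similar to the reference, which matches the limitation raised in the citation statements below: purely similarity-driven attention can struggle in occluded regions where valid views look dissimilar to the center view.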

Cited by 41 publications (17 citation statements)
References 29 publications (37 reference statements)
“…Tsai et al 12 proposed an attention-based view selection network to adaptively incorporate all angular views for depth estimation. Chen et al 13 introduced a method for obtaining light field disparity maps based on an attention mechanism for stereo matching. However, the attention mechanism, which primarily focuses on selecting views with higher similarity for disparity estimation, may be less effective in handling occluded areas, where the views might have lower similarity to each other.…”
Section: Introduction (mentioning)
confidence: 99%
“…view synthesis [11] and super-resolution [12]. In recent studies, many deep learning (DL) based algorithms have been proposed and have achieved significant improvements in the estimation of disparity information [13]- [15]. Convolutional…”
Section: Introduction (mentioning)
confidence: 99%
“…Alperovich et al [13] propose a fully convolutional autoencoder using 3D convolutions for disparity estimation of light fields. Similarly, Tsai et al [14] and Chen et al [15] utilize a mixture of 3D and 2D convolutional layers in learning-based disparity estimation networks. Although the 3D convolution extracts spatio-temporal information and is beneficial for volumetric data, it is computationally expensive and leads to significant memory consumption [16].…”
Section: Introduction (mentioning)
confidence: 99%
“…Light field (LF) cameras can capture both intensity and directions of light rays, and record 3D geometry in a convenient and efficient manner. By encoding 3D scene cues into 4D LF images (i.e., 2D for the spatial dimension and 2D for the angular dimension), LF cameras enable many attractive applications such as post-capture refocusing [3,4], depth sensing [5][6][7][8][9][10][11][12], virtual reality [13,14] and view rendering [15][16][17][18].…”
Section: Introduction (mentioning)
confidence: 99%