Light-field images captured by light-field cameras usually suffer from low spatial resolution due to the inherently limited sensor resolution. Light-field spatial super-resolution has thus become increasingly desirable for subsequent applications. Although continuous progress has been achieved, existing methods still fail to thoroughly exploit the coherence among light-field views. To address this issue, we propose an efficient neural network for light-field spatial super-resolution, in which spatial and angular information is fully exploited by repeatedly alternating between the spatial and angular domains. Specifically, an enhanced spatial-angular separable convolution block is proposed to efficiently exploit the correlation between sub-aperture images. Moreover, a multi-scale feature extraction block is introduced to extract feature representations at different scales and capture rich texture and semantic information. Experimental results on both synthetic and real-world light-field datasets demonstrate that the proposed method outperforms other state-of-the-art methods, achieving higher peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) values with fewer parameters.
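For illustration, the following is a minimal PyTorch sketch of the spatial-angular separable convolution idea that the abstract builds on: a 2-D convolution applied within each sub-aperture image, followed by a 2-D convolution applied across views at each pixel. The (B, C, U, V, H, W) tensor layout, 3x3 kernels, fixed channel count, and ReLU activation are all assumptions for the sketch; the paper's enhancements to this block and its multi-scale feature extraction block are not detailed in the abstract and are not reproduced here.

```python
# Sketch only: a plain spatial-angular separable (SAS) convolution block,
# assuming a light field stored as (batch, channels, U, V, H, W), where
# (U, V) are angular and (H, W) are spatial dimensions.
import torch
import torch.nn as nn

class SASConv(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Convolution within each sub-aperture image (spatial domain)
        self.spatial_conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # Convolution across views at each spatial position (angular domain)
        self.angular_conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, lf: torch.Tensor) -> torch.Tensor:
        b, c, u, v, h, w = lf.shape
        # Spatial pass: fold the angular dimensions into the batch dimension
        x = lf.permute(0, 2, 3, 1, 4, 5).reshape(b * u * v, c, h, w)
        x = self.act(self.spatial_conv(x))
        x = x.reshape(b, u, v, c, h, w)
        # Angular pass: fold the spatial dimensions into the batch dimension
        x = x.permute(0, 4, 5, 3, 1, 2).reshape(b * h * w, c, u, v)
        x = self.act(self.angular_conv(x))
        # Restore the original (B, C, U, V, H, W) layout
        return x.reshape(b, h, w, c, u, v).permute(0, 3, 4, 5, 1, 2)
```

Alternating several such blocks lets spatial features propagate through the angular domain and vice versa, which is the coherence-exploitation mechanism the abstract describes.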
Densely sampled light fields (LFs) are critical for further applications such as digital refocusing and depth estimation. However, capturing them is costly and time-consuming. LF reconstruction, which aims to reconstruct a densely sampled LF from a sparsely sampled one, has therefore attracted extensive attention from researchers. Although existing methods have achieved significant progress, they synthesize novel views either through depth estimation and image warping, which depends heavily on the accuracy of the depth maps and is prone to artifacts in occluded regions, or by stacking multi-layer convolutions to learn the inherent structure of the LF, which yields blurry results in scenes with large disparities owing to limited receptive fields. We propose a transformer-based neural network for LF reconstruction (termed LFRTR). Specifically, two novel transformers are introduced, namely the angular transformer and the spatial transformer. The former fully explores angular information and correlations among different views, whereas the latter captures local and non-local spatial texture information within each view. Moreover, dense skip connections are employed to enhance information flow between different layers. Thanks to the inherent global modeling ability of self-attention, the proposed LFRTR can reconstruct high-quality densely sampled LFs in complex scenarios involving large disparity, occlusion, and reflection. Experimental results on both synthetic and real-world LF datasets show that the proposed LFRTR outperforms other state-of-the-art methods in terms of both visual and numerical evaluations.
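As a rough sketch of the angular/spatial attention split, the PyTorch code below treats views as tokens for angular attention and pixels as tokens for spatial attention, using the same assumed (B, C, U, V, H, W) layout as above. The embedding size, head count, residual/norm arrangement, positional encodings, feed-forward sublayers, and the dense skip connections of the actual LFRTR are not specified in the abstract, so this is only an approximation of the general idea, not the paper's architecture.

```python
# Sketch only: angular attention (across the U*V views at each pixel) and
# spatial attention (across the H*W pixels within each view). The channel
# count `dim` is assumed to be divisible by the number of heads.
import torch
import torch.nn as nn

class AngularAttention(nn.Module):
    """Self-attention over views: one sequence of length U*V per pixel."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, lf: torch.Tensor) -> torch.Tensor:
        b, c, u, v, h, w = lf.shape
        x = lf.permute(0, 4, 5, 2, 3, 1).reshape(b * h * w, u * v, c)
        y, _ = self.attn(self.norm(x), self.norm(x), self.norm(x))
        x = x + y  # residual connection
        return x.reshape(b, h, w, u, v, c).permute(0, 5, 3, 4, 1, 2)

class SpatialAttention(nn.Module):
    """Self-attention over pixels: one sequence of length H*W per view."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, lf: torch.Tensor) -> torch.Tensor:
        b, c, u, v, h, w = lf.shape
        x = lf.permute(0, 2, 3, 4, 5, 1).reshape(b * u * v, h * w, c)
        y, _ = self.attn(self.norm(x), self.norm(x), self.norm(x))
        x = x + y  # residual connection
        return x.reshape(b, u, v, h, w, c).permute(0, 5, 1, 2, 3, 4)
```

Because every view (or pixel) attends to every other one in its sequence, the receptive field is global in the respective domain, which is what allows attention-based models to cope with large disparities better than stacked convolutions with limited receptive fields.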