This paper proposes extensions of CALIC for lossless compression of light field (LF) images. The overall prediction process is improved by exploiting the linear structure of epipolar plane images (EPIs) in a slope-based prediction scheme. The prediction is further improved by averaging the predictions made from horizontal and vertical EPIs. In addition, the difference between these two predictions is included in the error energy function, and the texture context is redefined to improve the overall compression ratio. Results show that the proposed method achieves significant bitrate savings over standard lossless coding schemes and offers a significant reduction in computational complexity compared to state-of-the-art compression schemes.
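To make the prediction step concrete, the following minimal sketch shows how a slope-based predictor could average contributions from the horizontal and vertical EPIs and expose their disagreement to the error energy function; the array layout, function name, and boundary clipping are illustrative assumptions, not the paper's exact predictor.

```python
import numpy as np

def epi_slope_prediction(lf, s, t, y, x, slope):
    """Minimal sketch of slope-based EPI prediction with averaging
    (illustrative; the paper's exact predictor and context model may
    differ). lf is a 4D array of shape (S, T, Y, X), with s/t the
    vertical/horizontal view indices; slope is the estimated EPI
    slope (disparity) in pixels per view."""
    # Horizontal EPI: fix (s, y), vary (t, x). The previously coded
    # view at t-1 contributes the pixel shifted by the slope along x.
    xs = int(np.clip(round(x - slope), 0, lf.shape[3] - 1))
    pred_h = float(lf[s, t - 1, y, xs])
    # Vertical EPI: fix (t, x), vary (s, y); shift along y instead.
    ys = int(np.clip(round(y - slope), 0, lf.shape[2] - 1))
    pred_v = float(lf[s - 1, t, ys, x])
    # Average the two EPI predictions; their disagreement is the extra
    # term added to the CALIC error energy for context modeling.
    prediction = 0.5 * (pred_h + pred_v)
    energy_term = abs(pred_h - pred_v)
    return prediction, energy_term
```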
Light field imaging is becoming a key technology that provides users with a realistic visual experience through dynamic viewpoint shifting. This capability comes at the cost of capturing huge amounts of information, making compression and transmission a challenge. In conventional light field coding schemes, encoder complexity is the key to efficient coding: a complex prediction process at the encoder side is used to exploit the redundancy present in the light field image. We instead employ Distributed Source Coding (DSC) for light field images, which largely lifts the computational requirement from the encoder at the expense of increased computational complexity at the decoder. The efficiency of DSC depends heavily on the quality of the side information available at the decoder. We therefore propose to leverage a learning-based view synthesis method that takes the light field structure into account to generate high-quality side information. We compare our approach to Distributed Video Coding and Distributed Multi-view Video Coding schemes adapted to the light field framework, as well as a relevant standards-based approach, and demonstrate that the proposed view synthesis-based approach achieves similar performance while substantially reducing the number of key views to be transmitted.
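The decoder-side flow described above can be summarized as in the sketch below; the helper callables (decode_key_view, synthesize_view, wz_decode) are hypothetical placeholders standing in for the conventional codec, the learned view synthesis network, and the Wyner-Ziv decoder, not the paper's API.

```python
def decode_light_field_dsc(key_bitstreams, wz_payloads,
                           decode_key_view, synthesize_view, wz_decode):
    """Illustrative decoder-side flow of a DSC light field scheme.
    key_bitstreams: conventionally coded key views;
    wz_payloads: dict mapping a non-key view position to its
    Wyner-Ziv parity bits."""
    # 1) Decode the sparse set of key views with a conventional codec.
    key_views = [decode_key_view(b) for b in key_bitstreams]
    views = {}
    for position, parity in wz_payloads.items():
        # 2) Generate side information for the non-key view with the
        #    learning-based view synthesis network.
        side_info = synthesize_view(key_views, position)
        # 3) Correct the synthesized view using the received parity
        #    bits (e.g., channel decoding in a Wyner-Ziv setup).
        views[position] = wz_decode(side_info, parity)
    return views
```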
Light field cameras enable new capabilities, such as post-capture refocusing and aperture control, by capturing the directional and spatial distribution of light rays in space. Microlens-array-based light field camera designs are often preferred for their light transmission efficiency, cost-effectiveness, and compactness. One drawback of such cameras is low spatial resolution, since a single sensor is shared to capture both spatial and angular information. To address this issue, we present a light field imaging approach in which multiple light fields are captured and fused to improve the spatial resolution. For each capture, the light field sensor is shifted, using an xy translation stage, by a fraction of the microlens size pre-determined for optimal performance.
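Under the idealized assumption of exact sub-microlens shifts on a regular grid, fusing the captures amounts to interleaving them onto a denser sampling grid, as in this sketch; the actual reconstruction may involve additional registration and restoration steps.

```python
import numpy as np

def fuse_shifted_captures(captures, shifts, factor=2):
    """Minimal sketch of fusing sensor-shifted captures into a higher-
    resolution grid (assumes ideal sub-microlens shifts; illustrative
    only). captures: list of (Y, X) arrays for one sub-aperture view;
    shifts: matching list of (dy, dx) offsets in units of 1/factor of
    the microlens pitch."""
    y, x = captures[0].shape
    hi_res = np.zeros((factor * y, factor * x), dtype=captures[0].dtype)
    for image, (dy, dx) in zip(captures, shifts):
        # Each shifted capture samples a distinct phase of the denser
        # grid, so interleaving the captures raises spatial resolution.
        hi_res[dy::factor, dx::factor] = image
    return hi_res

# Example: four captures with half-microlens shifts give 2x resolution:
# hi = fuse_shifted_captures([c00, c01, c10, c11],
#                            [(0, 0), (0, 1), (1, 0), (1, 1)])
```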
Consumer light-field (LF) cameras suffer from limited resolution because of the angular-spatial trade-off. To alleviate this drawback, we propose a novel learning-based approach that uses an attention mechanism to synthesize novel views of a light-field image from a sparse set of input views (i.e., the 4 corner views) of a camera array. The proposed method divides the process into three stages: stereo-feature extraction, disparity estimation, and final image refinement, with a sequential convolutional neural network for each stage. A residual convolutional block attention module (CBAM) is employed for the final adaptive image refinement; attention modules help the network learn and focus on the important features of the image and are applied sequentially in the channel and spatial dimensions, as sketched below. Experimental results show the robustness of the proposed method: our network outperforms state-of-the-art learning-based light-field view synthesis methods on two challenging real-world datasets by 0.5 dB on average. Furthermore, we provide an ablation study to substantiate our findings.
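As a reference for the refinement stage, here is a minimal PyTorch sketch of a CBAM block applying channel attention followed by spatial attention; the reduction ratio and kernel size are common defaults rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Minimal CBAM: channel attention followed by spatial attention,
    the building block of the adaptive refinement stage (illustrative
    hyperparameters)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        # Channel attention: a shared MLP over average- and max-pooled
        # per-channel descriptors.
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Spatial attention: a 7x7 convolution over pooled channel maps.
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        # Channel attention weights from avg- and max-pooled features.
        w = torch.sigmoid(self.mlp(x.mean(dim=(2, 3))) +
                          self.mlp(x.amax(dim=(2, 3))))
        x = x * w.view(b, c, 1, 1)
        # Spatial attention weights from channel-wise avg and max maps.
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(pooled))
```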
Light fields increase the degree of realism and immersion of the visual experience by capturing a scene with more dimensions than conventional 2D imaging. On the other hand, this higher dimensionality entails significant storage and transmission overhead compared to traditional video. Conventional coding schemes achieve high coding gains with an asymmetric codec design, in which the encoder is significantly more complex than the decoder. In the case of light fields, however, communication and processing among different cameras can be expensive, and the ability to trade complexity between the encoder and the decoder becomes a desirable feature. We leverage the distributed source coding paradigm to effectively reduce the encoder's complexity at the cost of increased computation at the decoder side. Specifically, we train two deep neural networks to improve the two most critical parts of a distributed source coding scheme: the prediction of side information and the estimation of the uncertainty in that prediction. Experiments show considerable BD-rate gains, above 59% over HEVC-Intra and 17.45% over our previous method DLFC-I.
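A sketch of how the two networks could cooperate at the decoder is given below; the function and network names, and the Laplacian correlation-noise interpretation, are assumptions for illustration rather than the paper's implementation.

```python
import torch
import torch.nn.functional as F

def predict_with_uncertainty(side_info_net, uncertainty_net, key_views):
    """Sketch of the roles of the two decoder-side networks
    (hypothetical names). key_views: tensor of decoded key views,
    e.g. of shape (N, C, H, W)."""
    # Network 1 predicts the side information for a missing view.
    side_info = side_info_net(key_views)
    # Network 2 estimates the per-pixel spread of the prediction error;
    # softplus keeps the scale positive. In a Wyner-Ziv decoder this
    # would parameterize the correlation-noise model (e.g., a
    # Laplacian) that supplies soft inputs to the channel decoder.
    scale = F.softplus(uncertainty_net(key_views))
    return side_info, scale
```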