In recent years, raw video denoising has attracted increasing attention because it is consistent with the imaging process and benefits from well-studied noise modeling in the raw domain. Despite these advances, two problems still limit denoising performance. First, there is no large dataset with realistic motion for supervised raw video denoising, because capturing noisy and clean frames of real dynamic scenes is difficult. To address this, we propose recapturing existing high-resolution videos displayed on a 4K screen. Specifically, we recapture the screen content at high and low ISO settings to construct noisy-clean paired frames. We then introduce intensity, spatial, and color correction strategies to align the paired frames, and concatenate the aligned frames in temporal order to construct paired videos. In this way, we construct a video denoising dataset (named ReCRVD) with 120 groups of noisy-clean videos, with ISO values ranging from 1600 to 25600. Second, while non-local temporal-spatial attention is beneficial for denoising, it often incurs heavy computation costs. In this work, we propose an efficient raw video denoising transformer network (RViDeformer) that explores both short- and long-distance correlations. Specifically, we introduce Low-Resolution-Window Self-Attention (LWSA), Global-Window Self-Attention (GWSA), and Neighbour-Window Self-Attention (NWSA) to build a multi-branch spatial self-attention for spatial reconstruction. Similarly, Global-Window Temporal Mutual Attention (GTMA) and Neighbour-Window Temporal Mutual Attention (NTMA) are proposed to build multi-branch temporal self-attention for temporal reconstruction. We employ reparameterization to reduce computation costs. Our network is trained in both supervised and unsupervised manners, achieving the best performance compared with state-of-the-art methods. Additionally, the model trained with our proposed dataset (ReCRVD) outperforms the model trained with the previous benchmark dataset (CRVD) when evaluated on real-world outdoor noisy videos. Our code and dataset will be released after the acceptance of this work.
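To illustrate why window-based attention is cheaper than full non-local attention, the following is a minimal NumPy sketch of self-attention restricted to non-overlapping windows. It is not the RViDeformer implementation: the function name `window_self_attention`, the use of the features themselves as queries, keys and values (no learned projections), and the non-overlapping tiling are all assumptions made for illustration.

```python
import numpy as np

def window_self_attention(x, window):
    """Self-attention restricted to non-overlapping window x window tiles.

    x: feature map of shape (H, W, C); H and W must be multiples of `window`.
    Queries, keys and values are the features themselves (no learned
    projections), which is a simplification for illustration only.
    """
    h, w, c = x.shape
    assert h % window == 0 and w % window == 0
    # Split the map into tiles of shape (num_tiles, window*window, C).
    tiles = x.reshape(h // window, window, w // window, window, c)
    tiles = tiles.transpose(0, 2, 1, 3, 4).reshape(-1, window * window, c)
    # Scaled dot-product attention inside each tile.
    scores = tiles @ tiles.transpose(0, 2, 1) / np.sqrt(c)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    out = weights @ tiles
    # Merge the tiles back into an (H, W, C) map.
    out = out.reshape(h // window, w // window, window, window, c)
    return out.transpose(0, 2, 1, 3, 4).reshape(h, w, c)
```

With H×W tokens, full attention compares every pair of tokens, so its cost grows with (HW)², whereas the windowed version compares each token only with the window² tokens in its own tile.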
In this paper, we propose introducing intrinsic image decomposition priors into decomposition models for contrast enhancement. Since image decomposition is a highly ill-posed problem, we introduce constraints on both the reflectance and illumination layers to yield a reliable solution. We regularize the reflectance layer to be piecewise constant by introducing a weighted ℓ norm constraint on neighboring pixels according to their color similarity, so that the decomposed reflectance is not affected much by the illumination information. The illumination layer is regularized by a piecewise smoothness constraint. The proposed model is solved efficiently by the Split Bregman algorithm. Then, by adjusting the illumination layer, we obtain the enhancement result. To avoid potential color artifacts introduced by illumination adjustment and to reduce computational complexity, the proposed decomposition model is applied to the value channel in HSV space. Experimental results demonstrate that the proposed method performs well for a wide variety of images and achieves better or comparable subjective and objective quality compared with state-of-the-art methods.
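The enhancement step (decompose the value channel into reflectance and illumination, then adjust the illumination) can be sketched as follows. This is a simplified illustration, not the proposed model: the illumination is estimated with a plain box filter instead of the regularized decomposition solved by Split Bregman, and the gamma adjustment, the names `box_blur` and `enhance`, and all parameter values are assumptions.

```python
import numpy as np

def box_blur(img, radius):
    """Mean filter with edge padding; a crude stand-in for an illumination estimate."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def enhance(rgb, gamma=0.5, radius=7, eps=1e-6):
    """rgb: float array in [0, 1] with shape (H, W, 3)."""
    v = rgb.max(axis=2)                           # HSV value channel
    illum = np.maximum(box_blur(v, radius), eps)  # smooth illumination layer
    refl = v / illum                              # reflectance, so V = R * L
    v_new = np.clip(refl * illum ** gamma, 0.0, 1.0)  # brighten illumination only
    # Scaling R, G and B by the same factor changes V but keeps hue and saturation.
    scale = v_new / np.maximum(v, eps)
    return np.clip(rgb * scale[..., None], 0.0, 1.0)
```

Working on the value channel alone means only one channel is decomposed, and rescaling the three color channels by a common factor leaves hue and saturation unchanged.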
Single image denoising suffers from the limited data available within a noisy image. In this paper, we propose a novel image denoising scheme that explores both internal and external correlations with the help of web images. For each noisy patch, we build internal and external data cubes by finding similar patches in the noisy image and in web images, respectively. We then propose reducing noise with a two-stage strategy using different filtering approaches. In the first stage, since the noisy patch may lead to inaccurate patch selection, we propose a graph-based optimization method to improve patch matching accuracy in external denoising. The internal denoising is frequency truncation on internal cubes. By combining the internally and externally denoised patches, we obtain a preliminary denoising result. In the second stage, we propose reducing noise by filtering the external and internal cubes, respectively, in the transform domain. In this stage, the preliminary denoising result not only enhances the patch matching accuracy but also provides reliable estimates of the filtering parameters. The final denoised image is obtained by fusing the external and internal filtering results. Experimental results show that our method consistently outperforms state-of-the-art denoising schemes in both subjective and objective quality measurements, e.g., it achieves >2 dB gain compared with BM3D over a wide range of noise levels.
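The internal step, frequency truncation on a cube of similar patches, can be illustrated with a short sketch. The choice of a 3D FFT as the transform, hard thresholding as the truncation rule, and the name `truncate_cube` are assumptions; the abstract does not specify these details.

```python
import numpy as np

def truncate_cube(cube, threshold):
    """Hard-threshold the 3D FFT coefficients of a stack of similar patches.

    cube: real array of shape (num_patches, patch_h, patch_w).
    Coefficients whose magnitude is below `threshold` are set to zero.
    """
    coeffs = np.fft.fftn(cube, norm="ortho")
    coeffs[np.abs(coeffs) < threshold] = 0.0
    return np.fft.ifftn(coeffs, norm="ortho").real

# Usage: eight noisy copies of a flat 8x8 patch, noise standard deviation 0.1.
rng = np.random.default_rng(0)
clean = np.full((8, 8, 8), 0.5)
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
denoised = truncate_cube(noisy, threshold=0.27)  # about 2.7 times the noise level
```

Because the patches in a cube are similar, the signal concentrates in a few large coefficients while the noise spreads evenly over all of them, so discarding small coefficients removes mostly noise.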
Accurate and high-quality depth maps are required in many 3D applications, such as multi-view rendering, 3D reconstruction and 3DTV. However, the resolution of a captured depth image is much lower than that of its corresponding color image, which affects its application performance. In this paper, we propose a novel depth map super-resolution (SR) method that takes view synthesis quality into account. The proposed approach includes two main technical contributions. First, since the captured low-resolution (LR) depth map may be corrupted by noise and occlusion, we propose a credibility-based multi-view depth map fusion strategy, which considers the view synthesis quality and inter-view correlation, to refine the LR depth map. Second, we propose a view synthesis quality based trilateral depth map up-sampling method, which considers depth smoothness, texture similarity and view synthesis quality in the up-sampling filter. Experimental results demonstrate that the proposed method outperforms state-of-the-art depth SR methods for both super-resolved depth maps and synthesized views. Furthermore, the proposed method is robust to noise and achieves promising results under noise-corruption conditions.
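The idea of a three-term up-sampling filter can be sketched as a guided filter whose weight is the product of a spatial term, a texture-similarity term from the color image, and a per-sample credibility term. This is an assumption-laden illustration rather than the proposed filter: `cred_lr` is a hypothetical stand-in for the view synthesis quality term, the Gaussian kernels and their parameters are assumed, and the guide is taken to be grayscale.

```python
import numpy as np

def trilateral_upsample(depth_lr, color_hr, cred_lr, scale,
                        radius=1, sigma_s=1.0, sigma_c=0.1):
    """Upsample depth_lr by an integer `scale`, guided by color_hr.

    depth_lr: (h, w) low-resolution depth map
    color_hr: (h*scale, w*scale) grayscale guide image in [0, 1]
    cred_lr:  (h, w) non-negative credibility weight per depth sample
              (hypothetical stand-in for a view-synthesis-quality term)
    """
    h, w = depth_lr.shape
    out = np.zeros((h * scale, w * scale))
    for y in range(h * scale):
        for x in range(w * scale):
            cy, cx = y // scale, x // scale
            num = den = 0.0
            for qy in range(max(cy - radius, 0), min(cy + radius, h - 1) + 1):
                for qx in range(max(cx - radius, 0), min(cx + radius, w - 1) + 1):
                    py, px = qy * scale, qx * scale  # sample position on HR grid
                    d2 = ((y - py) ** 2 + (x - px) ** 2) / scale ** 2
                    w_s = np.exp(-d2 / (2 * sigma_s ** 2))        # spatial term
                    dc = color_hr[y, x] - color_hr[py, px]
                    w_c = np.exp(-dc ** 2 / (2 * sigma_c ** 2))   # texture term
                    wgt = w_s * w_c * cred_lr[qy, qx]             # credibility term
                    num += wgt * depth_lr[qy, qx]
                    den += wgt
            # Fall back to the nearest sample if every weight is zero.
            out[y, x] = num / den if den > 0 else depth_lr[cy, cx]
    return out
```

The texture term keeps depth edges aligned with color edges, and setting a sample's credibility to zero removes it from the weighted average entirely.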
This paper proposes a depth super-resolution method with both transform and spatial domain regularization. In the transform domain regularization, nonlocal correlations are exploited via an auto-regressive model, where each patch is further sparsified with a locally-trained transform to consider intra-patch correlations. In the spatial domain regularization, we propose a multi-directional total variation (MTV) prior to characterize the geometrical structures spatially orientated at arbitrary directions in depth maps. To achieve adaptive regularization, the MTV is weighted for each directional finite difference considering local characteristics of RGB-D data. We develop an accelerated proximal gradient algorithm to solve the proposed model. Quantitative and qualitative evaluations compared with state-of-the-art methods demonstrate that the proposed method achieves superior depth super-resolution performance for various configurations of magnification factors and datasets.
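A multi-directional total variation term can be sketched as a weighted sum of absolute finite differences taken along several directions. The sketch below is an assumption, not the paper's formulation: it uses four directions (horizontal, vertical and two diagonals) and one scalar weight per direction, whereas the abstract describes weights adapted to local RGB-D characteristics. The names `DIRECTIONS`, `directional_diff` and `mtv` are hypothetical.

```python
import numpy as np

# Offsets (dy, dx): horizontal, vertical, and the two diagonals.
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]

def directional_diff(img, dy, dx):
    """Finite difference img[y+dy, x+dx] - img[y, x] over the valid region."""
    h, w = img.shape
    y0, y1 = max(0, -dy), h - max(0, dy)
    x0, x1 = max(0, -dx), w - max(0, dx)
    return img[y0 + dy:y1 + dy, x0 + dx:x1 + dx] - img[y0:y1, x0:x1]

def mtv(img, weights=None):
    """Weighted sum of absolute directional differences (one weight per direction)."""
    if weights is None:
        weights = [1.0] * len(DIRECTIONS)
    return sum(wk * np.abs(directional_diff(img, dy, dx)).sum()
               for wk, (dy, dx) in zip(weights, DIRECTIONS))
```

With only the first two weights nonzero, this reduces to the usual anisotropic total variation; the diagonal terms add sensitivity to edges that are not axis-aligned.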