2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020
DOI: 10.1109/cvpr42600.2020.00237

Supervised Raw Video Denoising With a Benchmark Dataset on Dynamic Scenes

Abstract: In recent years, raw video denoising has garnered increased attention due to the consistency with the imaging process and well-studied noise modeling in the raw domain. Despite these advancements, two problems still hinder the denoising performance. Firstly, there is no large dataset with realistic motions for supervised raw video denoising, as capturing noisy and clean frames for real dynamic scenes is difficult. To address this, we propose recapturing existing high-resolution videos displayed on a 4K screen.…

Cited by 96 publications (118 citation statements)
References 81 publications

“…In 2017, Plötz and Roth [38] established a real raw image denoising benchmark, which showed that these deep denoisers failed to generalize beyond the synthetic data used during training and were outperformed by standard non-learned methods, such as BM3D [11]. Subsequent work on both single [5,9] and multi-image [8,18,36,49] denoising demonstrated the benefits of training networks to operate directly on noisy raw input data. Modern cellphone camera pipelines perform a robust averaging of multiple noisy input frames in the raw domain [20], though they typically cannot afford to employ deep networks due to speed and power limitations.…”
Section: Denoising (citation type: mentioning)
confidence: 99%
“…Recent years have seen an increasing focus on developing deep learning methods for denoising images directly in the raw linear domain [5,9]. This effort has expanded to include multi-image denoisers that can be applied to burst images or video frames [8,43,49]. These multi-image denoisers typically assume that there is a relatively small amount of motion between frames, but that there may be large amounts of object motion within the scene.…”
Section: Denoising (citation type: mentioning)
confidence: 99%
“…al. [7] have proposed a technique for video denoising using a supervised learning approach. The authors have attempted to rectify the critical issues by designing motions for controllable objects.…”
Section: Figure 2 Change in PSNR Value With and Without Compression (citation type: mentioning)
confidence: 99%
“…The problem of spatially variant motion blur, related to independently moving objects in the presence of noise, has not yet been addressed in the deep learning literature. Not only is it an intrinsically challenging problem, but relevant research is also limited by the difficulty in constructing such labeled datasets [17], [18]. For instance, [12] used a beam splitter to construct real blurry-sharp frames, whereas [18] emulated motion by manually moving objects and used a fixed tripod to capture multiple frames of the same scene before generating real noisy-clean pairs of frames by averaging the noisy instantiations of the same scene.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…Not only is it an intrinsically challenging problem, but relevant research is also limited by the difficulty in constructing such labeled datasets [17], [18]. For instance, [12] used a beam splitter to construct real blurry-sharp frames, whereas [18] emulated motion by manually moving objects and used a fixed tripod to capture multiple frames of the same scene before generating real noisy-clean pairs of frames by averaging the noisy instantiations of the same scene. In this study, we rely on the realistic blurry dataset of [12] and the realistic Poisson-Gaussian noise model [18], [19].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
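
The last two statements mention building noisy-clean pairs by averaging repeated noisy captures of the same static scene, together with a Poisson-Gaussian noise model. The sketch below illustrates that general idea only and is not the cited authors' pipeline. The function names, the `gain` and `read_std` values, and the synthetic 64-pixel scene are all hypothetical, and the code uses only the Python standard library to stay dependency-free.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= limit:
            return count
        count += 1


def add_poisson_gaussian_noise(clean, gain, read_std, rng):
    """Return a noisy copy of `clean` (a flat list of pixel intensities).

    Shot noise is signal dependent: gain * Poisson(pixel / gain).
    Read noise is signal independent: zero-mean Gaussian with std `read_std`.
    """
    return [
        gain * sample_poisson(pixel / gain, rng) + rng.gauss(0.0, read_std)
        for pixel in clean
    ]


def average_frames(frames):
    """Average repeated captures pixel by pixel to form a pseudo-clean frame."""
    return [sum(values) / len(frames) for values in zip(*frames)]


def mean_squared_error(a, b):
    """Mean squared difference between two equally sized pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


rng = random.Random(0)

# Hypothetical static scene: 64 pixel intensities (assumed values).
clean = [rng.uniform(20.0, 100.0) for _ in range(64)]

# 32 noisy captures of the same scene; gain=2.0 and read_std=2.0 are assumptions.
noisy_frames = [add_poisson_gaussian_noise(clean, 2.0, 2.0, rng) for _ in range(32)]

# Averaging the captures suppresses the zero-mean noise.
pseudo_clean = average_frames(noisy_frames)

mse_single = mean_squared_error(noisy_frames[0], clean)
mse_averaged = mean_squared_error(pseudo_clean, clean)
print(f"single frame MSE: {mse_single:.2f}, averaged MSE: {mse_averaged:.2f}")
```

Because the noise in this model is zero-mean and independent across captures, the error of the averaged frame shrinks roughly in proportion to the number of frames. This only works when nothing moves between captures, which is why the statements describe a fixed tripod and manually repositioned objects.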