2017 IEEE International Conference on Computer Vision (ICCV) 2017
DOI: 10.1109/iccv.2017.479

Detail-Revealing Deep Video Super-Resolution

Abstract: Previous CNN-based video super-resolution approaches need to align multiple frames to the reference. In this paper, we show that proper frame alignment and motion compensation is crucial for achieving high quality results. We accordingly propose a "sub-pixel motion compensation" (SPMC) layer in a CNN framework. Analysis and experiments show the suitability of this layer in video SR. The final end-to-end, scalable CNN framework effectively incorporates the SPMC layer and fuses multiple frames to reveal image de…

Cited by 480 publications (487 citation statements)
References 26 publications
“…In addition to compression artifact removal, spatiotemporal correlation mining is also a hot topic in other video quality enhancement tasks, such as video super resolution (VSR). [4,18,19,23,32,37,42] estimated optical flow and warped several frames to capture the hidden spatiotemporal dependency for VSR. Although these methods work well, they rely heavily on the accuracy of motion estimation.…”
Section: Video Compression Artifact Reduction (mentioning)
confidence: 99%
“…Different from ConvLSTM in [37,41] that is fed with only feature $F_t$ at time $t$, NL-ConvLSTM takes additional feature $F_{t-1}$ at time $(t-1)$ as input, and outputs the corresponding hidden state and cell state $H_t, C_t \in \mathbb{R}^{C_h \times N}$. Here, $C_h$ is the number of channels of hidden state and cell state.…”
Section: The Framework (mentioning)
confidence: 99%
“…They use a multi-scale spatial transformer to warp the LR frame and eventually generate an HR frame through another deep network. Tao et al [20] proposed a sub-pixel motion compensation layer for frame alignment and used a convolution LSTM architecture in following SR reconstruction network.…”
Section: Video Super-resolution (mentioning)
confidence: 99%
“…Due to the motion of the camera or object, the neighboring frames should be spatially aligned first so as to utilize the information and extract missing details from them. To this end, the traditional VSR methods [16,20,18,1] usually calculate the optical flow and estimate the sub-pixel motion between LR frames to warp the neighboring frames and achieve the alignment operation. However, fast and reliable flow estimation still remains a challenging problem.…”
Section: Introduction (mentioning)
confidence: 99%
“…Similar to (Kappeler et al 2016b), (Caballero et al 2017) uses a trainable motion compensation network to replace the optical flow method in (Kappeler et al 2016b). Following this fashion, Tao et al (Tao et al 2017) propose a network comprising motion estimation, motion compensation, and detail fusion to process a batch of LR frames and output HR estimate. Different from the above mentioned approaches, (Sajjadi, Vemulapalli, and Brown 2018) proposes a frame recurrent video super-resolution (FRVSR) framework that combines the previous HR estimates to generate subsequent frame.…”
Section: Related Work (mentioning)
confidence: 99%