2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2017.304
Real-Time Video Super-Resolution with Spatio-Temporal Networks and Motion Compensation

Abstract: Convolutional neural networks have enabled accurate image super-resolution in real-time. However, recent attempts to benefit from temporal correlations in video super-resolution have been limited to naive or inefficient architectures. In this paper, we introduce spatio-temporal sub-pixel convolution networks that effectively exploit temporal redundancies and improve reconstruction accuracy while maintaining real-time speed. Specifically, we discuss the use of early fusion, slow fusion and 3D convolutions for the…

Cited by 634 publications (578 citation statements)
References 32 publications
“…Super-Resolution Single image super-resolution (SISR) methods upscale individual images; they include traditional methods based on bilinear and bicubic filters and recently-introduced learning-based techniques [26], [28]. SISR methods can be easily extended to support video or multi-frame super-resolution [10], [22], [34]. In the case of a multi-camera setup, super-resolution can be performed using some source as a reference [6], [14].…”
Section: Related Work
confidence: 99%
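The "traditional methods based on bilinear and bicubic filters" mentioned in the excerpt above can be illustrated with a minimal bilinear upscaler. This is a generic NumPy sketch for a 2-D grayscale image (the function name and setup are illustrative assumptions, not code from any of the cited works):

```python
import numpy as np

def bilinear_upscale(img, scale):
    """Upscale a 2-D grayscale image by an integer factor with bilinear interpolation."""
    h, w = img.shape
    # Target sample coordinates expressed in source-pixel space.
    ys = np.linspace(0, h - 1, h * scale)
    xs = np.linspace(0, w - 1, w * scale)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]
    # Blend the four neighbouring source pixels.
    top = (1 - wx) * img[y0][:, x0] + wx * img[y0][:, x1]
    bot = (1 - wx) * img[y1][:, x0] + wx * img[y1][:, x1]
    return (1 - wy) * top + wy * bot

# A horizontal ramp upscaled 2x stays a (finer) horizontal ramp.
up = bilinear_upscale(np.array([[0., 1.], [0., 1.]]), 2)
```

Learning-based SISR methods replace this fixed interpolation kernel with filters learned from data, which is where the accuracy gains cited in the excerpt come from.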
“…• Step 3: End-to-End Joint Training. Starting with our pretrained models, we jointly train FlowNet and FusionNet by minimizing the same end-to-end ℓ1 loss in (10). In this step, we set the learning rate to 10⁻⁵ for FusionNet and 3×10⁻⁶ for FlowNet over 100k iterations.…”
Section: Training Strategy and Loss Function
confidence: 99%
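The joint-training step quoted above amounts to minimizing a shared ℓ1 loss while stepping each module with its own learning rate. A toy NumPy sketch of that idea, with hypothetical zero-initialized parameter vectors standing in for the FlowNet and FusionNet weights (the real networks are trained with a deep-learning framework, not plain subgradient descent):

```python
import numpy as np

def l1_loss(pred, target):
    # Mean absolute error: the end-to-end l1 loss minimised in the excerpt.
    return np.abs(pred - target).mean()

# Learning rates quoted in the excerpt; parameter vectors are hypothetical.
lr = {"fusion": 1e-5, "flow": 3e-6}
params = {"fusion": np.zeros(4), "flow": np.zeros(4)}
target = np.ones(4)

for _ in range(3):
    for name, p in params.items():
        grad = np.sign(p - target)           # subgradient of the l1 loss
        params[name] = p - lr[name] * grad   # each module uses its own step size
```

After three steps each "fusion" weight has moved 3×10⁻⁵ toward the target and each "flow" weight 9×10⁻⁶, mirroring how the slower-updating FlowNet is perturbed less during joint fine-tuning.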
“…Between every two consecutive groups, max pooling is used to halve the spatial resolution of the feature map, giving rise to feature maps at 4 different scales. Inspired by [1], our network progressively fuses the slices in the input 3D patch by not performing the convolution operation on the 2 outermost slices in every 3D convolution layer, because these two slices are of least relevance to the central slice. We choose T to be the number of 3D convolution layers so that there exists only one slice (the central slice) in the final group of feature maps, E_k.…”
Section: Progressive Fusion Network
confidence: 99%
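The progressive-fusion scheme quoted above is, in effect, a "valid" (no temporal padding) 3D convolution: each layer drops the 2 outermost slices, so 2T + 1 input slices need T layers to collapse to the central slice. A schematic NumPy sketch of the temporal shrinkage only, where a mean "box" kernel stands in for learned 3D filters:

```python
import numpy as np

def temporal_valid_conv(x, k=3):
    """'Valid' convolution along the temporal (first) axis of a (T, H, W) array.

    A mean kernel stands in for learned filters; the point is that each
    layer consumes k - 1 = 2 outer slices, shrinking the temporal extent.
    """
    d = x.shape[0] - (k - 1)
    return np.stack([x[i:i + k].mean(axis=0) for i in range(d)])

x = np.random.rand(7, 8, 8)   # 7 temporal slices (hypothetical patch depth)
n_layers = 0
while x.shape[0] > 1:
    x = temporal_valid_conv(x)
    n_layers += 1
# With 2T + 1 = 7 input slices, T = 3 layers fuse down to the central slice.
```

The spatial dimensions are untouched here; in the quoted network, separate max pooling handles the spatial downscaling between groups.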
“…Then, we compare our approach with current state-of-the-art methods on the standard Vid4 benchmark dataset (Liu and Sun 2011) in terms of visual quality, objective metrics, temporal consistency, and computational cost. Following (Caballero et al. 2017), the evaluation metrics of peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) are computed on the brightness channel of the 4-video Vid4 dataset. Thirdly, we detail the suppression-updating algorithm for suppressing the iteration error of high-frequency information.…”
Section: Experiments and Analyses
confidence: 99%
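Computing PSNR "on the brightness channel", as the excerpt describes, means converting RGB frames to luma before measuring error. A minimal NumPy sketch, assuming 8-bit-range pixel values and BT.601 luma weights (a common convention for this evaluation protocol, though the excerpt does not specify the exact conversion):

```python
import numpy as np

def rgb_to_y(img):
    # BT.601 luma ("brightness channel") from an RGB image in [0, 255].
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b

def psnr(ref, test, peak=255.0):
    # Peak signal-to-noise ratio in decibels.
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

ref = np.zeros((4, 4, 3))
test = ref + 1.0                              # every channel off by one level
value = psnr(rgb_to_y(ref), rgb_to_y(test))   # ≈ 48.13 dB
```

SSIM is computed on the same luma channel; averaging both metrics over all frames of the four Vid4 clips gives the per-dataset numbers reported in such comparisons.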
“…To evaluate the performance of our approach on real-world data, following (Caballero et al. 2017), a visual comparison result is reported in Fig. 8.…”
Section: Real-World Examples
confidence: 99%