BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond

Chan, Kelvin C. K.; Wang, Xintao; Yu, Ke; Dong, Chao; Loy, Chen Change

doi:10.1109/cvpr46437.2021.00491

Cited by 268 publications

(138 citation statements)

References 25 publications

Supporting

Mentioning

138

Contrasting

Order By: Relevance

“…For example, designing a VSR which can simultaneously estimate unknown blur kernels, motion fields, and latent HR videos effectively is prospective. Lots of VSR models have been proposed, but there is not a unified framework being dominant for VSR in practice yet (Chan et al 2021;Yi et al 2021). Research Topic 3: Exploring a generic, efficient, and easy-to-implement baseline framework for VSR, which can serve as a standard for various comparison and evaluation.…”

Section: Discussionmentioning

confidence: 99%

Optical Flow for Video Super-Resolution: A Survey

Tu¹,

Li²,

Xie³

et al. 2022

Preprint

View full text Add to dashboard Cite

Video super-resolution is currently one of the most active research topics in computer vision as it plays an important role in many visual applications. Generally, video super-resolution contains a significant component, i.e., motion compensation, which is used to estimate the displacement between successive video frames for temporal alignment. Optical flow, which can supply dense and sub-pixel motion between consecutive frames, is among the most common ways for this task. To obtain a good understanding of the effect that optical flow acts in video super-resolution, in this work, we conduct a comprehensive review on this subject for the first time. This investigation covers the following major topics: the function of super-resolution (i.e., why we require super-resolution); the concept of video super-resolution (i.e., what is video super-resolution); the description of evaluation metrics (i.e., how (video) superresolution performs); the introduction of optical flow based video super-resolution; the investigation of using optical flow to capture temporal dependency for video super-resolution. Prominently, we give an in-depth study of the deep learning based video super-resolution method, where some representative algorithms are analyzed

show abstract

Section: Discussionmentioning

confidence: 99%

Optical Flow for Video Super-Resolution: A Survey

Tu¹,

Li²,

Xie³

et al. 2022

Preprint

View full text Add to dashboard Cite

show abstract

“…b) Synthetic Frames for Video Super-Resolution: We illustrate that the interpolated frames by our model also benefit video super-resolution methods. A well-trained recurrent framework BasicVSR [56] is adopted due to its strong performance as well as the flexibility to allow an arbitrary number of input frames. In the basic setting, we take three consecutive low-resolution frames as input:…”

Section: D) Census Transformmentioning

confidence: 99%

Exploring Motion Ambiguity and Alignment for High-Quality Video Frame Interpolation

Zhou¹,

Li²,

Han³

et al. 2022

Preprint

View full text Add to dashboard Cite

For video frame interpolation (VFI), existing deeplearning-based approaches strongly rely on the ground-truth (GT) intermediate frames, which sometimes ignore the nonunique nature of motion judging from the given adjacent frames. As a result, these methods tend to produce averaged solutions that are not clear enough. To alleviate this issue, we propose to relax the requirement of reconstructing an intermediate frame as close to the GT as possible. Towards this end, we develop a texture consistency loss (TCL) upon the assumption that the interpolated content should maintain similar structures with their counterparts in the given frames. Predictions satisfying this constraint are encouraged, though they may differ from the predefined GT. Without the bells and whistles, our plug-and-play TCL is capable of improving the performance of existing VFI frameworks. On the other hand, previous methods usually adopt the cost volume or correlation map to achieve more accurate image/feature warping. However, the O(N 2 ) (N refers to the pixel count) computational complexity makes it infeasible for highresolution cases. In this work, we design a simple, efficient (O(N )) yet powerful cross-scale pyramid alignment (CSPA) module, where multi-scale information is highly exploited. Extensive experiments justify the efficiency and effectiveness of the proposed strategy.Compared to state-of-the-art VFI algorithms, our method boosts the PSNR performance by 0.66dB on the Vimeo-Triplets dataset and 1.31dB on the Vimeo90K-7f dataset. In addition, our method is easily extended to the video frame extrapolation task. Surprisingly, our extrapolation model has achieved a 0.91dB PSNR gain over FLAVR under the same experimental setting, while being 2× times smaller in terms of the model size. At last, we show that our high-quality interpolated frames are also beneficial to the development of the video super-resolution task.

show abstract

“…Jo et al [32] propose a super-resolution method based on learned upsampling kernels rather than direct prediction. Chan et al [33] analyze critical components of superresolution and use this to design a simple, flexible architecture. Relevant here, Li et al [34] consider super-resolution with common video compression settings.…”

Section: Prior Work Jpeg Artifact Correctionmentioning

confidence: 99%

Leveraging Bitstream Metadata for Fast and Accurate Video Compression Correction

Ehrlich¹,

Barker²,

Namitha³

et al. 2022

Preprint

View full text Add to dashboard Cite

Video compression is a central feature of the modern internet powering technologies from social media to video conferencing. While video compression continues to mature, for many, and particularly for extreme, compression settings, quality loss is still noticeable. These extreme settings nevertheless have important applications to the efficient transmission of videos over bandwidth constrained or otherwise unstable connections. In this work, we develop a deep learning architecture capable of restoring detail to compressed videos which leverages the underlying structure and motion information embedded in the video bitstream. We show that this improves restoration accuracy compared to prior compression correction methods and is competitive when compared with recent deep-learning-based video compression methods on rate-distortion while achieving higher throughput.

show abstract

BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond

Cited by 268 publications

References 25 publications

Optical Flow for Video Super-Resolution: A Survey

Optical Flow for Video Super-Resolution: A Survey

Exploring Motion Ambiguity and Alignment for High-Quality Video Frame Interpolation

Leveraging Bitstream Metadata for Fast and Accurate Video Compression Correction

Contact Info

Product

Resources

About