Temporal Modulation Network for Controllable Space-Time Video Super-Resolution

Xu, Gang; Xu, Jun; Li, Zhen; Wang, Liang; Sun, Xing; Cheng, Ming-Ming

doi:10.1109/cvpr46437.2021.00632

Cited by 77 publications

(75 citation statements)

References 39 publications


“…However, these VSR models would only produce pre-defined intermediate frames, causing them constrained to highly-controlled scenarios with fixed frame-rate videos. Consequently, exploiting controllable spatio-temporal VSR approaches, which with the deformable convolution network, for smooth motion synthesizing it is necessary (Xu et al 2021).…”

Section: Discussion (mentioning)

confidence: 99%

Optical Flow for Video Super-Resolution: A Survey

Tu, Li, Xie et al. 2022

Preprint


Video super-resolution is currently one of the most active research topics in computer vision as it plays an important role in many visual applications. Generally, video super-resolution contains a significant component, i.e., motion compensation, which is used to estimate the displacement between successive video frames for temporal alignment. Optical flow, which can supply dense and sub-pixel motion between consecutive frames, is among the most common ways for this task. To obtain a good understanding of the effect that optical flow acts in video super-resolution, in this work, we conduct a comprehensive review on this subject for the first time. This investigation covers the following major topics: the function of super-resolution (i.e., why we require super-resolution); the concept of video super-resolution (i.e., what is video super-resolution); the description of evaluation metrics (i.e., how (video) super-resolution performs); the introduction of optical flow based video super-resolution; the investigation of using optical flow to capture temporal dependency for video super-resolution. Prominently, we give an in-depth study of the deep learning based video super-resolution method, where some representative algorithms are analyzed


“…Zooming Slow-Mo [5] developed a unified framework with deformable ConvLSTM to align and aggregate temporal information and then synthesize the intermediate features by a bidirectional recurrent network before performing feature fusion for STVSR. Based on Zooming Slow-Mo, Xu et al. [7] proposed a temporal modulation network via locally-temporal feature comparison module and deformable convolution kernels for controllable feature interpolation, which can interpolate arbitrary intermediate frames.…”

Section: Video Time-space Super-resolution (mentioning)

confidence: 99%

“…For the two-stage methods, we perform video frame interpolation (VFI) by SuperSloMo [15], [46] or SepConv [24], and perform video super-resolution (VSR) by Bicubic Interpolation (BI), RCAN [47], RBPN [9] or EDVR [8]. For one-stage STVSR models, we compare our network with recently state-of-the-art methods Zooming SlowMo [5], STARnet [6] or TMnet [7]. When training, we use Vimeo-90K trainset [42] and feed odd LR frames into the model and reconstruct HR frames corresponding to the frames of the entire sequence.…”

Section: B Comparison With State-of-the-arts (mentioning)

confidence: 99%

“…1: Comparation of accuracy (PSNR) and speed (FPS) of different methods on Vid4 [4] dataset. Our method is faster and more accurate than other state-of-the-art methods, such as Zooming [5], STARnet [6], TMNet [7], while maintaining a relatively small amount of parameters.…”

Section: Introduction (mentioning)

confidence: 98%

“…Compared with STARnet, this method supports relatively longer input video sequences and meanwhile costs less on computation and memory. Based on Zooming Slow-Mo, Xu et al further proposed TMNet [7] which can perform controllable frame interpolation at any intermediate moment. After serious research and Reflection on the existing work, we design a bidirectional recurrent network for ST-VSR that can make better use of local information and global information with high efficiency.…”

Section: Introduction (mentioning)

confidence: 99%


Optical Flow Reusing for High-Efficiency Space-Time Video Super Resolution

Wang, Chen 2021

Preprint


In this paper, we consider the task of space-time video super-resolution (ST-VSR), which simultaneously increases the spatial resolution and frame rate for a given video. However, existing methods typically suffer from the difficulties in how to efficiently leverage information from a large range of neighboring frames or avoiding the speed degradation in the inference using deformable ConvLSTM strategies for alignment. To solve the above problem of the existing methods, we propose a coarse-to-fine bidirectional recurrent neural network instead of using ConvLSTM to leverage knowledge between adjacent frames. Specifically, we first use bi-directional optical flow to update the hidden state and then employ a Feature Refinement Module (FRM) to refine the result. Since we could fully utilize a large range of neighboring frames, our method leverages local and global information more effectively. In addition, we propose an optical-flow-reuse strategy that can reuse the intermediate flow of adjacent frames, which considerably reduces the computation burden of frame alignment compared with existing LSTM-based designs. Extensive experiments demonstrate that our optical-flow-reuse-based bidirectional recurrent network (OFR-BRN) is superior to the state-of-the-art methods both in terms of accuracy and efficiency.
