Bringing Old Films Back to Life

Wan, Ziyu; Zhang, Bo; Chen, Dongdong; Liao, Jing

doi:10.48550/arxiv.2203.17276

Cited by 1 publication

(2 citation statements)

References 47 publications

(69 reference statements)


“…It becomes a recurrent model when N = 1 or a transformer model when N = T . This is fundamentally different from previous methods that adopt transformer blocks to replace CNN blocks within a recurrent architecture [77,43]. It is also different from existing attempts in natural language processing [81,33].…”

Section: Recurrent Feature Refinement (mentioning)

confidence: 69%


Recurrent Video Restoration Transformer with Guided Deformable Attention

Liang, Yi, Xiang et al. 2022

Preprint


Video restoration aims at restoring multiple high-quality frames from multiple low-quality frames. Existing video restoration methods generally fall into two extreme cases, i.e., they either restore all frames in parallel or restore the video frame by frame in a recurrent way, which would result in different merits and drawbacks. Typically, the former has the advantage of temporal information fusion. However, it suffers from large model size and intensive memory consumption; the latter has a relatively small model size as it shares parameters across frames; however, it lacks long-range dependency modeling ability and parallelizability. In this paper, we attempt to integrate the advantages of the two cases by proposing a recurrent video restoration transformer, namely RVRT. RVRT processes local neighboring frames in parallel within a globally recurrent framework which can achieve a good trade-off between model size, effectiveness, and efficiency. Specifically, RVRT divides the video into multiple clips and uses the previously inferred clip feature to estimate the subsequent clip feature. Within each clip, different frame features are jointly updated with implicit feature aggregation. Across different clips, the guided deformable attention is designed for clip-to-clip alignment, which predicts multiple relevant locations from the whole inferred clip and aggregates their features by the attention mechanism. Extensive experiments on video super-resolution, deblurring, and denoising show that the proposed RVRT achieves state-of-the-art performance on benchmark datasets with balanced model size, testing memory and runtime. The codes are available at https://github.com/JingyunLiang/RVRT. Preprint. Under review.
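The clip-wise recurrence described in the abstract can be illustrated with a minimal sketch. This is not the RVRT implementation: `split_into_clips`, `restore_clipwise`, and `toy_refine` are hypothetical names, frames are plain floats rather than feature maps, and the toy update merely stands in for the joint feature update and guided deformable attention. It only shows how one clip length N spans the two extremes: N = 1 gives frame-by-frame recurrence, while N = T processes the whole video as a single clip.

```python
from typing import Callable, List, Sequence, Tuple


def split_into_clips(frames: Sequence[float], clip_len: int) -> List[List[float]]:
    """Partition T frames into consecutive clips of at most clip_len frames."""
    if clip_len < 1:
        raise ValueError("clip_len must be at least 1")
    return [list(frames[i:i + clip_len]) for i in range(0, len(frames), clip_len)]


def restore_clipwise(
    frames: Sequence[float],
    clip_len: int,
    refine_clip: Callable[[List[float], float], Tuple[List[float], float]],
    initial_state: float,
) -> List[float]:
    """Refine each clip jointly, conditioned on the state of the previous clip."""
    state = initial_state
    outputs: List[float] = []
    for clip in split_into_clips(frames, clip_len):
        # Frames inside a clip are handled together (the parallel part);
        # the state carries information from clip to clip (the recurrent part).
        restored, state = refine_clip(clip, state)
        outputs.extend(restored)
    return outputs


def toy_refine(clip: List[float], prev_state: float) -> Tuple[List[float], float]:
    """Hypothetical stand-in for a learned update: blend each frame with the
    current clip mean and with the previous clip's state."""
    clip_mean = sum(clip) / len(clip)
    restored = [0.5 * x + 0.25 * clip_mean + 0.25 * prev_state for x in clip]
    return restored, clip_mean


if __name__ == "__main__":
    video = [0.0, 2.0, 4.0, 6.0]
    for n in (1, 2, len(video)):
        print(n, restore_clipwise(video, n, toy_refine, initial_state=0.0))
```

With a larger clip length, fewer recurrent steps are needed, but more frames must be held and processed at once, which mirrors the trade-off between model size, memory, and efficiency that the abstract describes.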



“…It aims to restore a clear and sharp high-quality video from a degraded (e.g., downsampled, blurred, or noisy) low-quality video [79,9,4,37]. It has wide applications in live streaming [96], video surveillance [48], old film restoration [77], and more.…”

Section: Introduction (mentioning)

confidence: 99%