2021
DOI: 10.48550/arxiv.2111.15483
Preprint
ST-MFNet: A Spatio-Temporal Multi-Flow Network for Frame Interpolation

Abstract: Video frame interpolation (VFI) is currently a very active research topic, with applications spanning computer vision, post production and video encoding. VFI can be extremely challenging, particularly in sequences containing large motions, occlusions or dynamic textures, where existing approaches fail to offer perceptually robust interpolation performance. In this context, we present a novel deep learning based VFI method, ST-MFNet, based on a Spatio-Temporal Multi-Flow architecture. ST-MFNet employs a new mu…

Cited by 2 publications (7 citation statements)
References 39 publications (101 reference statements)
“…Specifically, we computed four features for each sequence: spatial information (SI), temporal information (TI), motion vector (MV) and dynamic texture parameter (DTP). The latter two were included because motion magnitude and complexity have direct impact on VFI [3]. The calculation of SI, TI and DTP can be found in [19] and MV is described in [20].…”
Section: Reference Sequence Selection
confidence: 99%
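The SI and TI features referenced above follow the standard ITU-T P.910 definitions: SI is the maximum over frames of the pixel standard deviation of the Sobel-filtered luminance, and TI is the maximum over consecutive frame differences of their pixel standard deviation. A minimal NumPy sketch (function names are illustrative, not taken from [19]; MV and DTP extraction is not shown):

```python
import numpy as np

def sobel_magnitude(frame):
    """Gradient magnitude of a 2D luminance frame via 3x3 Sobel kernels."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = frame.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    # Accumulate the 3x3 neighbourhood products (valid region only).
    for i in range(3):
        for j in range(3):
            patch = frame[i:i + h - 2, j:j + w - 2]
            gx += kx[i, j] * patch
            gy += ky[i, j] * patch
    return np.sqrt(gx ** 2 + gy ** 2)

def spatial_information(frames):
    """SI: max over frames of the std of the Sobel gradient magnitude."""
    return max(sobel_magnitude(f).std() for f in frames)

def temporal_information(frames):
    """TI: max over consecutive-frame differences of their pixel std."""
    return max((b - a).std() for a, b in zip(frames, frames[1:]))
```

Frames are assumed to be 2D float arrays of luminance values; for a colour sequence the luminance plane would be extracted first.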
“…The dropped frames were then reconstructed from their neighbouring frames using five different VFI methods. These include frame repeating, frame averaging (from two adjacent frames), DVF [1], [2] and ST-MFNet [3]. The first two were included because they have very low computational complexity and produce distinctive artifact types, juddering and blurring respectively.…”
Section: Test Sequence Generation
confidence: 99%
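The two low-complexity baselines described above are straightforward: frame repeating copies one neighbour (producing judder), while frame averaging takes the pixel-wise mean of the two adjacent frames (producing blur/ghosting). A sketch with hypothetical helper names:

```python
import numpy as np

def interp_repeat(prev, nxt):
    """Frame repeating: reuse the previous frame as the interpolated one.

    Very cheap, but motion appears to 'stick', causing judder.
    """
    return prev.copy()

def interp_average(prev, nxt):
    """Frame averaging: pixel-wise mean of the two adjacent frames.

    Also very cheap, but moving content is doubled/ghosted, i.e. blurred.
    """
    return 0.5 * (prev + nxt)
```

Learned methods such as DVF [1] and ST-MFNet [3] replace these trivial rules with motion-compensated synthesis, trading computational cost for perceptual quality.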
“…It is noted that most of the existing DefConv-based methods use only two adjacent frames to interpolate the middle frame, whereas the use of a larger window can provide richer information on the spatio-temporal characteristics of the input signal [13,16]. Moreover, when predicting multi-flows under the DefConv-based VFI framework, only 2D convolutional neural networks (CNNs) have been employed, which cannot provide an explicit way of extracting temporal information from the content.…”
Section: Introduction
confidence: 99%