2020
DOI: 10.1609/aaai.v34i07.6634

Video Frame Interpolation via Deformable Separable Convolution

Abstract: Learning to synthesize non-existing frames from the original consecutive video frames is a challenging task. Recent kernel-based interpolation methods predict pixels with a single convolution process to replace the dependency on optical flow. However, when scene motion is larger than the pre-defined kernel size, these methods yield poor results even though they take thousands of neighboring pixels into account. To solve this problem, in this paper we propose to use deformable separable convolution (DSepConv) t…
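The abstract contrasts fixed-window kernel prediction with deformable separable convolution. Below is a minimal PyTorch-style sketch of that idea, assuming per-pixel vertical and horizontal kernels and per-tap offsets predicted by some upstream network; the tensor shapes, function name, and kernel size are illustrative assumptions, not the authors' released implementation (which additionally predicts masks and blends both input frames).

```python
# Sketch of deformable separable convolution for one frame (assumed shapes).
# Per output pixel: K vertical weights, K horizontal weights, and a 2-D offset
# per kernel tap, so the effective support is not limited to a fixed KxK window.
import torch
import torch.nn.functional as F

def deformable_separable_conv(frame, kv, kh, offsets):
    """frame:   (B, C, H, W) input frame
       kv, kh:  (B, K, H, W) per-pixel separable kernel weights
       offsets: (B, 2*K*K, H, W) per-pixel (dx, dy) for every kernel tap
       returns: (B, C, H, W) resampled frame"""
    B, C, H, W = frame.shape
    K = kv.shape[1]
    r = K // 2
    # Base pixel coordinates (unnormalized), shape (1, H, W)
    ys, xs = torch.meshgrid(torch.arange(H, dtype=frame.dtype),
                            torch.arange(W, dtype=frame.dtype), indexing="ij")
    xs = xs.to(frame.device).unsqueeze(0)
    ys = ys.to(frame.device).unsqueeze(0)
    out = torch.zeros_like(frame)
    offsets = offsets.reshape(B, K, K, 2, H, W)  # tap (i, j) -> (dx, dy)
    for i in range(K):          # vertical tap index
        for j in range(K):      # horizontal tap index
            dx = offsets[:, i, j, 0]
            dy = offsets[:, i, j, 1]
            # Deformed sampling positions for this tap
            sample_x = xs + (j - r) + dx
            sample_y = ys + (i - r) + dy
            # Normalize to [-1, 1] for grid_sample
            grid = torch.stack((2.0 * sample_x / (W - 1) - 1.0,
                                2.0 * sample_y / (H - 1) - 1.0), dim=-1)
            sampled = F.grid_sample(frame, grid, mode="bilinear",
                                    padding_mode="border", align_corners=True)
            # Separable weighting: kv[i] * kh[j], broadcast over channels
            weight = (kv[:, i] * kh[:, j]).unsqueeze(1)
            out = out + weight * sampled
    return out
```

In the interpolation setting this resampling would be applied to both input frames and the two results combined, but that blending step is omitted here.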

Cited by 100 publications (66 citation statements)
References 27 publications
“…The convolutional layer is the core of the network, and most calculations are performed in the convolutional layer. The feature map is generated in the convolution operation and output to the next layer for feature extraction [13]. In the convolution operation, the convolution kernel learns the best parameters for extracting features through iterations.…”
Section: Convolutional Layer
confidence: 99%
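The quoted passage is general background on convolutional layers. As a toy illustration only (not code from either paper), the snippet below shows a convolutional layer producing a feature map for the next layer and its kernel weights being updated over one iteration; the placeholder loss exists purely to drive an update.

```python
# Toy example: a conv layer yields a feature map, and its kernel parameters
# are refined iteratively by gradient descent.
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
optimizer = torch.optim.SGD(conv.parameters(), lr=1e-2)

x = torch.randn(1, 3, 64, 64)      # dummy RGB input
feature_map = conv(x)               # (1, 16, 64, 64) feature map for the next layer
loss = feature_map.pow(2).mean()    # placeholder objective, illustration only
loss.backward()
optimizer.step()                    # one iteration of updating the kernel parameters
```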
“…Motion approximation for backward warping: Conventional algorithms [2,3,16,34,36] approximate the motion fields V_{t→0} and V_{t→1} in (6). For example, the flow projection in [2,3] approximates V_{t→0} and V_{t→1} by aggregating multiple flow vectors between I_0 and I_1, which pass near each pixel in I_t.…”
Section: Motion-based Frame Warping
confidence: 99%
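The quoted passage concerns approximating the motion fields V_{t→0} and V_{t→1} so the intermediate frame can be synthesized by backward warping. The sketch below shows generic backward warping given an approximated flow; the simple linear scaling noted in the final comment is one common approximation and is not the flow-projection scheme of the cited works, which aggregates flow vectors passing near each pixel of I_t.

```python
# Backward warping: sample the source frame at positions shifted by the flow.
import torch
import torch.nn.functional as F

def backward_warp(frame, flow):
    """frame: (B, C, H, W) source frame (e.g. I_0)
       flow:  (B, 2, H, W) motion field V_{t->0} in pixels (dx, dy)
       returns the frame sampled at x + flow, i.e. an estimate of I_t."""
    B, _, H, W = frame.shape
    ys, xs = torch.meshgrid(torch.arange(H, dtype=frame.dtype),
                            torch.arange(W, dtype=frame.dtype), indexing="ij")
    xs = xs.to(frame.device) + flow[:, 0]
    ys = ys.to(frame.device) + flow[:, 1]
    grid = torch.stack((2.0 * xs / (W - 1) - 1.0,
                        2.0 * ys / (H - 1) - 1.0), dim=-1)
    return F.grid_sample(frame, grid, mode="bilinear",
                         padding_mode="border", align_corners=True)

# One common linear-motion approximation (not the flow projection of [2,3]):
# with V_{0->1} known, V_{t->0} is roughly -t * V_{0->1}, so
# I_t_from_0 = backward_warp(I_0, -t * flow_0_to_1)
```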
“…Thus, we adopt a lightweight optical flow network [31] on LR frames and a flow refine network [26] to get the middle flow on HR frames, and we try a new supervised flow loss to achieve better perception. Recently, meta-learning is also introduced into frame interpolation [7]; CAIN [8] adapts channel attention into VFI; and EDSC [6] uses ConvLSTM to learn motion offset for implicit motion compensation.…”
Section: Video Frame Interpolation (VFI)
confidence: 99%
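The quoted pipeline estimates optical flow on low-resolution frames and then refines it on high-resolution frames. The sketch below is an assumed structure for that coarse-to-fine pattern (the FlowRefiner module and its layer sizes are hypothetical, not the cited networks); the key detail is that upsampled flow values must be rescaled by the upsampling factor so displacements stay in pixel units.

```python
# Coarse-to-fine flow: upsample LR flow, rescale its magnitudes, refine on HR frames.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FlowRefiner(nn.Module):
    """Tiny residual refiner over (HR frame pair + upsampled flow); hypothetical sizes."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + 3 + 2, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 2, 3, padding=1))

    def forward(self, hr0, hr1, flow_up):
        return flow_up + self.net(torch.cat([hr0, hr1, flow_up], dim=1))

def lr_to_hr_flow(flow_lr, scale, hr0, hr1, refiner):
    # Upsample the LR flow and multiply by the scale so displacements remain in pixels.
    flow_up = scale * F.interpolate(flow_lr, scale_factor=scale,
                                    mode="bilinear", align_corners=False)
    return refiner(hr0, hr1, flow_up)
```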