2022
DOI: 10.1007/978-3-031-19790-1_40

FlowFormer: A Transformer Architecture for Optical Flow

Abstract: This paper introduces a novel transformer-based network architecture, FlowFormer, along with the Masked Cost Volume AutoEncoding (MCVA) for pretraining it to tackle the problem of optical flow estimation. FlowFormer tokenizes the 4D cost-volume built from the source-target image pair and iteratively refines flow estimation with a cost-volume encoder-decoder architecture. The cost-volume encoder derives a cost memory with alternate-group transformer (AGT) layers in a latent space and the decoder recurrently dec…
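The pipeline sketched in the abstract (build a 4D all-pairs cost volume from the image pair, project it into latent tokens, then decode a flow estimate) can be illustrated with a minimal PyTorch-style sketch. This is an illustrative assumption, not the authors' implementation; names such as all_pairs_cost_volume and CostVolumeTokenizer, and the choice of a single linear projection, are hypothetical.

```python
# Minimal sketch of cost-volume construction and tokenization (assumed design,
# not FlowFormer's actual code).
import torch
import torch.nn as nn


def all_pairs_cost_volume(feat_src, feat_tgt):
    """Build a (B, H1*W1, H2*W2) cost volume from source/target feature maps."""
    b, c, h1, w1 = feat_src.shape
    f1 = feat_src.flatten(2)                      # (B, C, H1*W1)
    f2 = feat_tgt.flatten(2)                      # (B, C, H2*W2)
    cost = torch.einsum("bci,bcj->bij", f1, f2)   # dot product over channels
    return cost / c ** 0.5                        # scale for numerical stability


class CostVolumeTokenizer(nn.Module):
    """Project each source pixel's cost map into a latent token (hypothetical)."""

    def __init__(self, num_target_pixels, token_dim=256):
        super().__init__()
        self.proj = nn.Linear(num_target_pixels, token_dim)

    def forward(self, cost):                      # cost: (B, H1*W1, H2*W2)
        return self.proj(cost)                    # tokens: (B, H1*W1, token_dim)


# Example: feature maps at 1/8 resolution of a 64x64 image pair.
feat1 = torch.randn(1, 128, 8, 8)
feat2 = torch.randn(1, 128, 8, 8)
cost = all_pairs_cost_volume(feat1, feat2)                  # (1, 64, 64)
tokens = CostVolumeTokenizer(num_target_pixels=64)(cost)    # (1, 64, 256)
```

In the paper's described architecture these tokens would then be processed by the AGT encoder and decoded recurrently into flow; the sketch above stops at tokenization.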


Cited by 85 publications (42 citation statements) · References 81 publications
“…• We demonstrate the effectiveness of DistractFlow in supervised [6,14,30] and semi-supervised settings, showing that DistractFlow outperforms the very recent FlowSupervisor [9], which requires additional in-domain unlabeled data.…”
Section: Introduction (mentioning)
confidence: 91%
“…Optical Flow Estimation: Several deep architectures have been proposed for optical flow [4,8,23,29,30,38]. Among these, Recurrent All-Pairs Field Transforms (RAFT) [30] has shown significant performance improvement over previous methods, inspiring many subsequent works [6,14,26,27,35]. Following the structure of the RAFT architecture, complementary studies [12,14,33,35,39] proposed advancements on feature extraction, the 4D correlation volume, recurrent update blocks, and, more recently, transformer extensions [6,39].…”
Section: Related Work (mentioning)
confidence: 99%
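The RAFT-style pattern referenced in this excerpt (query a correlation volume and let a recurrent update block predict residual flow) can be sketched as follows. This is a toy illustration, not code from RAFT or FlowFormer; UpdateBlock and refine_flow are hypothetical names, and the correlation features are kept fixed across iterations for brevity, whereas RAFT re-samples them around the current flow estimate at every step.

```python
# Illustrative recurrent refinement loop (assumed, simplified design).
import torch
import torch.nn as nn


class UpdateBlock(nn.Module):
    """Toy update block: (hidden state, correlation features, flow) -> delta flow."""

    def __init__(self, corr_dim, hidden_dim=96):
        super().__init__()
        self.gru = nn.GRUCell(corr_dim + 2, hidden_dim)
        self.flow_head = nn.Linear(hidden_dim, 2)

    def forward(self, hidden, corr_feat, flow):
        x = torch.cat([corr_feat, flow], dim=-1)  # per-pixel input features
        hidden = self.gru(x, hidden)              # recurrent state update
        return hidden, self.flow_head(hidden)     # predict residual flow


def refine_flow(corr_feat, iters=8, hidden_dim=96):
    """Iteratively refine a per-pixel 2D flow field starting from zero."""
    num_pixels, corr_dim = corr_feat.shape
    block = UpdateBlock(corr_dim, hidden_dim)
    flow = torch.zeros(num_pixels, 2)
    hidden = torch.zeros(num_pixels, hidden_dim)
    for _ in range(iters):
        hidden, delta = block(hidden, corr_feat, flow)
        flow = flow + delta                       # residual refinement
    return flow


# Example: 64 pixels with 32-dimensional correlation features.
flow = refine_flow(torch.randn(64, 32))           # (64, 2)
```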
“…Attempts have been made to develop effective VFI methods [1,2,6,9-27]. In particular, with the advances in optical flow estimation [28-37], motion-based VFI methods provide remarkable performance. But VFI for high-resolution videos, e.g.…”
Section: Introduction (mentioning)
confidence: 99%