Self-Supervised Real-time Video Stabilization

Choi, Jinsoo; Park, Jaesik; Kweon, In So

doi:10.48550/arxiv.2111.05980

Cited by 1 publication

(1 citation statement)

References 27 publications

“…To tackle the problem of video stabilization in dynamic scenes, a dense warping field as scene representation was trained from consecutive video frames by Liu et al [38], then the warped parts are blended to synthesize the stabilized image. The framework by J. Choi et al [37] even made use of a motion prediction model based on optical flow tracking. However, all those models involve a lot of processing overhead and are quite heavy to deploy on an edge device in real time.…”

Section: Related Work (mentioning)

confidence: 99%

Adaptive Sampling-based Particle Filter for Visual-inertial Gimbal in the Wild

Kang, Ariel, Lema, et al. 2023

2023 IEEE International Conference on Robotics and Automation (ICRA)

In this paper, we present a Computer Vision (CV) based tracking and fusion algorithm, dedicated to a 3D-printed gimbal system on drones flying in nature. The whole gimbal system can stabilize the camera orientation robustly in challenging environments by using the skyline and ground plane as references. Our main contributions are the following: a) a lightweight ResNet-18 backbone network model was trained from scratch and deployed onto the Jetson Nano platform to segment the image into two classes (ground and sky); b) our geometry assumption from the skyline and ground cues delivers the potential for robust visual tracking in the wild by using the skyline and ground plane as references; c) a manifold surface-based adaptive particle sampling can fuse orientation from multiple sensor sources flexibly. The whole algorithm pipeline is tested on our 3D-printed gimbal module with Jetson Nano. The experiments were performed on top of a building in a real landscape. The code is publicly available at https://github.com/alexandor91/gimbalfusion.git.

• A lightweight binary segmentation model is trained to label the ground and sky pixels specifically, aiming for real-time inference on the embedded device.
