2021
DOI: 10.48550/arxiv.2106.04067
Preprint

LocalTrans: A Multiscale Local Transformer Network for Cross-Resolution Homography Estimation

Abstract: Experiments on the MS-COCO dataset and a real-captured cross-resolution dataset show that the proposed network outperforms existing state-of-the-art feature-based and deep-learning-based homography estimation methods, and is able to accurately align images under a 10× resolution gap.
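
Aligning images across a large resolution gap relies on a standard coordinate-scaling identity: a homography H estimated in low-resolution coordinates can be lifted to full-resolution coordinates by conjugating with the scale matrix S = diag(s, s, 1). The sketch below is a minimal numpy illustration of that identity, not the paper's implementation; the function name is ours.

```python
import numpy as np

def upscale_homography(H, s):
    """Lift a homography estimated in low-res coordinates to full-res coordinates.

    If full-res coordinates satisfy x_full = s * x_low, then the full-res
    homography is H_full = S @ H @ inv(S), with S = diag(s, s, 1).
    """
    S = np.diag([s, s, 1.0])
    return S @ H @ np.linalg.inv(S)
```

Because S only rescales the x and y axes, a point warped by H at low resolution and then scaled by s lands exactly where H_full sends the scaled point, so the alignment transfers across the resolution gap.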

Cited by 1 publication (4 citation statements)
References 43 publications (124 reference statements)
“…Deep Homography Estimation Deep homography estimation can be categorized into supervised and unsupervised methods. Supervised methods [8,18,28] learn from image pairs with ground truth homographies, which are difficult to obtain for natural images in the wild. If learning from synthetic images, the lack of realistic transformation will degrade their generalization ability.…”
Section: Related Work
confidence: 99%
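The citation above notes that supervised methods need image pairs with ground-truth homographies, which are usually synthesized rather than captured. A common recipe (popularized by the 4-point parameterization in supervised deep homography work) randomly perturbs the four corners of a patch and solves for the homography relating the original and perturbed corners. The following is a minimal numpy sketch under that assumption; the function names are ours, and the homography is solved by the standard direct linear transform (DLT).

```python
import numpy as np

def homography_from_points(src, dst):
    """Solve the 3x3 homography mapping src -> dst (4+ point pairs) via DLT."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two rows of the linear system A h = 0.
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector for the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def synthetic_pair(rng, size=128, max_shift=16):
    """Create a ground-truth homography by perturbing the four patch corners."""
    src = np.array([[0, 0], [size, 0], [size, size], [0, size]], dtype=float)
    dst = src + rng.uniform(-max_shift, max_shift, size=(4, 2))
    return src, dst, homography_from_points(src, dst)
```

Warping the source patch by the returned homography then yields a training pair whose ground-truth transformation is known exactly, sidestepping the annotation problem at the cost of the realism gap the citation describes.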
“…[39] and [18] introduced mask prediction into homography estimation, but their goal is to remove large foregrounds or moving objects, while our goal is to preserve a single dominant plane with explicit constraint. Recently, Shao et al [28] proposed a supervised transformer for cross-resolution homography estimation. However, aiming at different tasks, our architecture designs are also different, where they propose a transformer with local attention, while ours contains a self-attention encoder and class-attention decoder.…”
Section: Related Work
confidence: 99%