ConvMatch: Rethinking Network Design for Two-View Correspondence Learning

Zhang, Shihua; Ma, Jiayi

doi:10.1609/aaai.v37i3.25456

Cited by 6 publications

(2 citation statements)

References 28 publications

“…We evaluate our methods for two-view image matching on three datasets, including YFCC100M ( The performance is compared to NN with RT (Lowe 2004), learnable filter ConvMatch (Zhang and Ma 2023), and feature matching GNNs including SuperGlue (Sarlin et al 2020), SGMNet (Chen et al 2021), ParaFormer (Lu et al 2023a), and LightGlue (9 layers) (Lindenberger, Sarlin, and Pollefeys 2023). All GNNs are trained on GL3D.…”

Section: Methods (mentioning)

confidence: 99%

“…PointCN (Yi et al 2018) takes an early effort to learn match filtering as a classification task. ConvMatch (Zhang and Ma 2023), an alternative, employs self-attention to model vector field consensus (Ma et al 2014), which shares a similar motivation with us. Instead of filtering putative sets, SuperGlue (Sarlin et al 2020) designs an attention-based GNN to match sparse features in a graph matching manner.…”

Section: Related Work (mentioning)

confidence: 98%

ResMatch: Residual Attention Learning for Feature Matching

Deng, Zhang, Zhang et al. 2024

AAAI

Attention-based graph neural networks have made great progress in feature matching. However, the literature lacks a comprehensive understanding of how the attention mechanism operates for feature matching. In this paper, we rethink cross- and self-attention from the viewpoint of traditional feature matching and filtering. To facilitate the learning of matching and filtering, we incorporate the similarity of descriptors into cross-attention and relative positions into self-attention. In this way, the attention can concentrate on learning residual matching and filtering functions with reference to the basic functions of measuring visual and spatial correlation. Moreover, we leverage descriptor similarity and relative positions to extract inter- and intra-neighbors. Then sparse attention for each point can be performed only within its neighborhoods to acquire higher computation efficiency. Extensive experiments, including feature matching, pose estimation and visual localization, confirm the superiority of the proposed method. Our codes are available at https://github.com/ACuOoOoO/ResMatch.
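The cross-attention idea in the abstract above can be illustrated with a minimal NumPy sketch. It assumes that descriptor similarity enters as an additive bias on the attention logits, so the learned query-key term only has to model a residual on top of that visual prior. The function name, argument names, and the additive form are illustrative assumptions, not taken from the ResMatch code or paper.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def residual_cross_attention(q, k, v, desc_a, desc_b):
    """Hypothetical cross-attention with a descriptor-similarity bias.

    q:      (N, d) queries from keypoints in image A
    k, v:   (M, d) keys and values from keypoints in image B
    desc_a: (N, c) L2-normalised descriptors of image A
    desc_b: (M, c) L2-normalised descriptors of image B
    """
    d = q.shape[-1]
    learned = q @ k.T / np.sqrt(d)   # learned (residual) matching term
    prior = desc_a @ desc_b.T        # cosine similarity of raw descriptors
    weights = softmax(learned + prior, axis=-1)
    return weights @ v, weights


rng = np.random.default_rng(0)
desc_a = rng.normal(size=(5, 8))
desc_b = rng.normal(size=(7, 8))
desc_a /= np.linalg.norm(desc_a, axis=1, keepdims=True)
desc_b /= np.linalg.norm(desc_b, axis=1, keepdims=True)
q = rng.normal(size=(5, 16))
k = rng.normal(size=(7, 16))
v = rng.normal(size=(7, 16))
out, weights = residual_cross_attention(q, k, v, desc_a, desc_b)
```

With zero queries the learned term vanishes and the attention falls back to a softmax over descriptor similarity alone, which is the sense in which the learned part acts as a residual in this sketch.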
