2022
DOI: 10.1007/978-3-031-20047-2_6

CMT: Context-Matching-Guided Transformer for 3D Tracking in Point Clouds

Cited by 14 publications (13 citation statements)
References 40 publications
“…We present a comprehensive comparison of our method with the previous state-of-the-art approaches, namely SC3D [8], P2B [23], 3DSiamRPN [6], LTTR [5], MLVS-Net [30], BAT [34], PTT [24], V2B [10], CMT [9], PTTR [36], STNet [11], TAT [16], M2-Track [35] and CX-Track [31] on the KITTI dataset. The published results from corresponding papers are reported.…”
Section: Results
confidence: 99%
“…V2B [10] proposes to transform point features into a dense bird's eye view feature map to tackle the sparsity of point clouds. LTTR [5], PTTR [36], CMT [9] and STNet [11] introduce various attention mechanisms into the 3D SOT task for better target-specific feature propagation. PTTR [36] also proposes a light-weight Prediction Refinement Module for coarse-to-fine localization.…”
Section: Related Work
confidence: 99%
“…V2B (Hui et al 2021) performs Voxel-to-BEV transformation for object localization on the densified feature maps. Inspired by the success of Transformer (Vaswani et al 2017) on computer vision tasks (Liu et al 2021;Carion et al 2020a), several studies (Zhou et al 2022;Cui et al 2021;Shan et al 2021;Hui et al 2022;Guo et al 2022;Nie et al 2023;Xu et al 2023) incorporate Transformer for enhanced feature extraction and correlation modeling and achieve improved accuracy.…”
Section: Related Work
confidence: 99%
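The voxel-to-BEV idea the excerpts above attribute to V2B (collapsing sparse point features into a dense bird's-eye-view map) can be sketched generically. This is a minimal NumPy illustration, not V2B's actual implementation; the function name, grid size, and extent are hypothetical, and the z-axis is simply collapsed by max-pooling.

```python
import numpy as np

def points_to_bev(points, feats, grid=(32, 32), extent=4.0):
    """Max-pool per-point features into a dense bird's-eye-view grid.

    points: (N, 3) xyz coordinates, assumed within [-extent/2, extent/2]
    feats:  (N, C) per-point features
    Returns a (grid_x, grid_y, C) dense BEV feature map.
    """
    gx, gy = grid
    bev = np.zeros((gx, gy, feats.shape[1]), dtype=feats.dtype)
    # Map x/y coordinates to cell indices; z is dropped (the "to-BEV" step).
    ix = np.clip(((points[:, 0] / extent + 0.5) * gx).astype(int), 0, gx - 1)
    iy = np.clip(((points[:, 1] / extent + 0.5) * gy).astype(int), 0, gy - 1)
    for i in range(len(points)):
        # Scatter-max: each cell keeps the elementwise max of its points.
        bev[ix[i], iy[i]] = np.maximum(bev[ix[i], iy[i]], feats[i])
    return bev
```

The resulting dense map can then be processed with ordinary 2D convolutions for localization, which is the motivation the quoted papers give for densifying sparse point features.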
“…V2B [33] designs a voxel-to-BEV object localization network to tackle sparse point clouds. Other techniques such as LTTR [34], PTT [35], PTTR [36], STNet [37], and CMT [38] develop sophisticated transformer structures to improve feature fusion or object localization. Nevertheless, none of them challenges…”
Section: A 3D Siamese Tracking
confidence: 99%
“…As mentioned above, several trackers [34]–[38] based on the transformer architecture have been introduced for 3D SOT on point clouds. These methods typically employ self-attention to refine features or cross-attention to facilitate interaction between the features extracted from the template and search regions.…”
Section: B Vision Transformer
confidence: 99%
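The cross-attention pattern described in the last excerpt (search-region queries attending to template keys/values for target-specific feature propagation) can be sketched as follows. This is an illustrative single-head NumPy version under assumed shapes, not the architecture of any specific cited tracker.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(search_feats, template_feats):
    """Propagate template information into search-region features.

    search_feats:   (Ns, d) queries from the search region
    template_feats: (Nt, d) keys and values from the template
    Returns (Ns, d) target-specific search features.
    """
    d = search_feats.shape[1]
    q, k, v = search_feats, template_feats, template_feats
    # Scaled dot-product attention: each search point mixes template features.
    attn = softmax(q @ k.T / np.sqrt(d), axis=-1)  # (Ns, Nt), rows sum to 1
    return attn @ v
```

Self-attention, the other mechanism the excerpt mentions, is the degenerate case where queries, keys, and values all come from the same feature set.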