2021
DOI: 10.48550/arxiv.2110.14921
Preprint

3D Object Tracking with Transformer

Abstract: Feature fusion and similarity computation are two core problems in 3D object tracking, especially for object tracking using sparse and disordered point clouds. Feature fusion can make similarity computation more efficient by including target object information. However, most existing LiDAR-based approaches directly use the extracted point cloud feature to compute similarity while ignoring the attention changes of object regions during tracking. In this paper, we propose a feature fusion network based on transf…

Cited by 14 publications (22 citation statements)
References 25 publications (41 reference statements)
“…V2B [9] proposes a voxel-to-BEV (Bird's Eye View) target localization network, which projects the point features into a dense BEV feature map to tackle the sparsity of point clouds. Inspired by the success of transformers in 2D vision tasks [2,4], LTTR [3] adopts a transformer-based architecture to fuse features from two branches and propagate target cues. PTT [24] integrates a transformer module into the P2B architecture to refine point features.…”
Section: Related Work
confidence: 99%
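The voxel-to-BEV projection mentioned above can be illustrated with a minimal sketch: scatter per-point features into a dense 2D grid, max-pooling points that fall into the same cell. The function name, grid size, and coordinate ranges below are hypothetical illustration choices, not V2B's actual implementation.

```python
import numpy as np

def points_to_bev(xyz, feats, grid_size=(32, 32),
                  x_range=(-3.2, 3.2), y_range=(-3.2, 3.2)):
    """Scatter per-point features (N, C) into a dense BEV map (H, W, C)
    by max-pooling points that land in the same grid cell."""
    H, W = grid_size
    bev = np.zeros((H, W, feats.shape[1]), dtype=feats.dtype)
    # Map each point's x/y coordinate to a cell index.
    ix = ((xyz[:, 0] - x_range[0]) / (x_range[1] - x_range[0]) * W).astype(int)
    iy = ((xyz[:, 1] - y_range[0]) / (y_range[1] - y_range[0]) * H).astype(int)
    # Drop points outside the grid.
    valid = (ix >= 0) & (ix < W) & (iy >= 0) & (iy < H)
    for i, j, f in zip(iy[valid], ix[valid], feats[valid]):
        bev[i, j] = np.maximum(bev[i, j], f)  # per-cell max pooling
    return bev

# Two nearby points collapse into one BEV cell; features are max-pooled.
xyz = np.array([[0.0, 0.0, 0.0], [0.1, 0.1, 0.5]])
feats = np.array([[1.0, 2.0], [3.0, 1.0]])
bev = points_to_bev(xyz, feats)
print(bev.shape)  # (32, 32, 2)
```

The resulting dense map sidesteps point-cloud sparsity: a standard 2D convolutional head can then localize the target on the BEV plane.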
“…We make comprehensive comparisons on the KITTI dataset with previous state-of-the-art methods, including SC3D [7], P2B [22], 3DSiamRPN [5], LTTR [3], MLVS-Net [31], BAT [34], PTT [24], V2B [9], PTTR [36], STNet [10] and M2-Track [35]. As illustrated in Tab.…”
Section: Comparison With State-of-the-Arts
confidence: 99%
“…Results on KITTI. We compare M2-Track with seven top-performance approaches [6,8,10,13,23,26,42,43], which have published results on KITTI. As shown in Tab.…”
Section: Comparison With State-of-the-Arts
confidence: 99%
“…Our method perfectly keeps track of the target while BAT almost fails. To handle such fast-moving objects, [6,42] leverage BEV-based RPN to generate high-recall proposals from a larger search region. In contrast, we handle this simply by motion modeling without sophisticated architectures.…”
Section: Comparison With State-of-the-Arts
confidence: 99%