F-Siamese Tracker: A Frustum-based Double Siamese Network for 3D Single Object Tracking

Zou, Hao; Cui, Jinhao; Kong, Xin; Zhang, Chujuan; Liu, Yong; Wen, Feng; Li, Wanlong

doi:10.1109/iros45743.2020.9341120

Cited by 24 publications

(22 citation statements)

References 15 publications

“…Previous works usually focus on RGB-D data [2,14], which heavily depend on visual features. Recently, with the development of 3D vision methods, there are many LiDAR-based 3D object tracking works [11,20,30]. For example, Giancola et al [11] used point clouds to track object in LiDAR space based on computing the cosine similarity between template and search branch.…”

Section: 3D Object Tracking (mentioning)

confidence: 99%

“…However, they ignored the characteristics of the point clouds. Zou et al [30] leveraged RGB image feature to generate 3D search space, and used point clouds feature to track. Based on [11], Qi et al [20] proposed a feature fusion module to augment search point features and achieved state-of-the-art tracking performance.…”

Section: 3D Object Tracking (mentioning)

confidence: 99%

“…Previous works use shape completion [11], image prior [30], or feature augmentation [20] to deal with the above problems. Although they achieve better tracking performance, they usually ignore the attention changes in different regions of the object during tracking.…”

Section: Introduction (mentioning)

confidence: 99%

“…Code is available at: https://github.com/3bobo/lttr. Recently, LiDAR-based 3D object tracking has been received more and more attention. Benefiting from the development of visual tracking [1,7,13,15,16], most 3D tracking methods [11,20,30] also use the Siamese-like tracking pipeline. The pipeline first inputs template point clouds of the target object and search point clouds of the current frame to its top and bottom branches respectively, then fuses the two-branch features based on similarity.…”

mentioning

confidence: 99%

3D Object Tracking with Transformer

Cui, Fang, Shan et al. 2021

Preprint

Feature fusion and similarity computation are two core problems in 3D object tracking, especially for object tracking using sparse and disordered point clouds. Feature fusion could make similarity computing more efficient by including target object information. However, most existing LiDAR-based approaches directly use the extracted point cloud feature to compute similarity while ignoring the attention changes of object regions during tracking. In this paper, we propose a feature fusion network based on transformer architecture. Benefiting from the self-attention mechanism, the transformer encoder captures the inter- and intra-relations among different regions of the point cloud. By using cross-attention, the transformer decoder fuses features and includes more target cues into the current point cloud feature to compute the region attentions, which makes the similarity computing more efficient. Based on this feature fusion network, we propose an end-to-end point cloud object tracking framework, a simple yet effective method for 3D object tracking using point clouds. Comprehensive experimental results on the KITTI dataset show that our method achieves new state-of-the-art performance. Code is available at: https://github.com/3bobo/lttr.

Recently, LiDAR-based 3D object tracking has received more and more attention. Benefiting from the development of visual tracking [1,7,13,15,16], most 3D tracking methods [11,20,30] also use the Siamese-like tracking pipeline. The pipeline first inputs template point clouds of the target object and search point clouds of the current frame to its top and bottom branches respectively, then fuses the two-branch features based on similarity. Finally, the fused features are used to localize the position of the object to be tracked. However, compared with visual tracking, LiDAR-based tracking has more challenges due to the sparsity and disorder of the point clouds. For example, the point clouds become much sparser as the distance to the object increases, which hinders feature extraction. Meanwhile, …
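The Siamese-like pipeline described above (embed a template and several search candidates with a shared encoder, then rank the candidates by similarity) can be illustrated with a minimal sketch. This is not the method of any paper listed here: `toy_embed` is a hypothetical hand-written stand-in for a learned point cloud encoder, and `best_candidate` is an illustrative name. Only the cosine-similarity ranking step reflects what the citation statements describe.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]  # (x, y, z)


def toy_embed(points: Sequence[Point]) -> List[float]:
    """Hypothetical stand-in for a learned encoder (assumption, not from the papers).

    Returns the mean absolute deviation from the centroid along each axis,
    which is translation-invariant and roughly captures the object's extents.
    """
    n = len(points)
    centroid = [sum(p[i] for p in points) / n for i in range(3)]
    return [sum(abs(p[i] - centroid[i]) for p in points) / n for i in range(3)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_candidate(
    template: Sequence[Point], candidates: Sequence[Sequence[Point]]
) -> Tuple[int, List[float]]:
    """Embed both branches with the same encoder and rank candidates by similarity."""
    template_feat = toy_embed(template)
    scores = [cosine_similarity(template_feat, toy_embed(c)) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Usage: a car-like box as template, one shifted copy and one pedestrian-like box.
car = [(x, y, z) for x in (0.0, 4.0) for y in (0.0, 2.0) for z in (0.0, 1.5)]
shifted_car = [(x + 10.0, y + 5.0, z) for x, y, z in car]
pedestrian = [(x, y, z) for x in (0.0, 0.5) for y in (0.0, 0.5) for z in (0.0, 1.8)]
best, scores = best_candidate(car, [shifted_car, pedestrian])
```

In a real tracker the encoder is learned and the candidates come from a search region around the previous target position; the sketch only shows why a shared encoder plus a similarity score is enough to pick the candidate that most resembles the template.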

“…Specially for 3D tracking, Giancola et al [2] introduced completion regularization to train a Siamese network. Subsequently, in light of the limitation of candidate box generation, Qi et al [16] designed a point-to-box network, Zou [40] reduced redundant search space using a 3D frustum, and Fang et al extended the region proposal network into pointNet++ [41] for 3D tracking. Nevertheless, all of above methods put more emphasis on distinguishing the target from a lot of proposals.…”

Section: 3D Point Cloud Tracking (mentioning)

confidence: 99%
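The statement above says a 3D frustum was used to reduce the redundant search space. The exact procedure is not given on this page, so the following is only a generic sketch of frustum cropping under stated assumptions: points are already in the camera frame with z pointing forward, the camera is an ideal pinhole with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, and the 2D box comes from some image-based tracker. The function name `frustum_crop` is illustrative.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def frustum_crop(
    points_cam: Sequence[Point3D],
    box_2d: Tuple[float, float, float, float],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Keep points whose pinhole projection falls inside a 2D image box.

    Assumptions (not from the source): camera-frame coordinates with z forward,
    no lens distortion, box given as (u_min, v_min, u_max, v_max) in pixels.
    """
    u_min, v_min, u_max, v_max = box_2d
    kept: List[Point3D] = []
    for x, y, z in points_cam:
        if z <= 0.0:
            continue  # behind the camera, cannot lie in the viewing frustum
        u = fx * x / z + cx
        v = fy * y / z + cy
        if u_min <= u <= u_max and v_min <= v <= v_max:
            kept.append((x, y, z))
    return kept


# Usage: only points that project into the 20x20 pixel box survive.
cloud = [(0.0, 0.0, 10.0), (5.0, 0.0, 10.0), (0.0, 0.0, -5.0), (0.5, 0.5, 10.0)]
inside = frustum_crop(cloud, (40.0, 40.0, 60.0, 60.0), fx=100.0, fy=100.0, cx=50.0, cy=50.0)
```

The surviving points form the reduced 3D search space; a point cloud tracker would then run only on them instead of on the full scan.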

Learning the Incremental Warp for 3D Vehicle Tracking in LiDAR Point Clouds

Tian, Liu, Li et al. 2021

Remote Sensing

Object tracking from LiDAR point clouds, which are always incomplete, sparse, and unstructured, plays a crucial role in urban navigation. Some existing methods utilize a learned similarity network for locating the target, immensely limiting the advancements in tracking accuracy. In this study, we leveraged a powerful target discriminator and an accurate state estimator to robustly track target objects in challenging point cloud scenarios. Considering the complex nature of estimating the state, we extended the traditional Lucas and Kanade (LK) algorithm to 3D point cloud tracking. Specifically, we propose a state estimation subnetwork that aims to learn the incremental warp for updating the coarse target state. Moreover, to obtain a coarse state, we present a simple yet efficient discrimination subnetwork. It can project 3D shapes into a more discriminatory latent space by integrating the global feature into each point-wise feature. Experiments on the KITTI and PandaSet datasets showed that, compared with the most advanced existing methods, our proposed method achieves significant improvements, in particular up to 13.68% on KITTI.

Towards Generic 3D Tracking in RGBD Videos: Benchmark and Baseline

Yang, Zhang et al. 2022

Lecture Notes in Computer Science
