2021
DOI: 10.3390/s21062113
Relation3DMOT: Exploiting Deep Affinity for 3D Multi-Object Tracking from View Aggregation

Abstract: Autonomous systems need to localize and track surrounding objects in 3D space for safe motion planning. As a result, 3D multi-object tracking (MOT) plays a vital role in autonomous navigation. Most MOT methods use a tracking-by-detection pipeline, which includes both the object detection and data association tasks. However, many approaches detect objects in 2D RGB sequences for tracking, which lacks reliability when localizing objects in 3D space. Furthermore, it is still challenging to learn discriminative fe…

Cited by 3 publications (3 citation statements)
References 47 publications (71 reference statements)

Citation statements:
“…However, because of the additional modules and the independent execution of each module, the inference speed of the improved model drops significantly. To account for feature interaction between objects detected in different frames, Bochinski et al. [15] fused appearance and motion features captured from 2D RGB images and 3D point clouds to better exploit the correlation between each pair of objects in adjacent frames. Moreover, none of the above-mentioned models considers a potentially shared structure across the multiple tasks.…”
Section: Multi-Object Tracking (mentioning, confidence: 99%)
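The fusion idea quoted above can be sketched in a few lines. This is a minimal illustration, not the cited model: the feature dimensions, the concatenation-based fusion, and the small MLP scoring head (the hypothetical class `PairwiseAffinity`) are assumptions made only for the example.

```python
# Hypothetical sketch: pairwise affinity from fused appearance + motion features.
# Shapes and the fusion MLP are assumptions, not the cited paper's architecture.
import torch
import torch.nn as nn

class PairwiseAffinity(nn.Module):
    def __init__(self, appearance_dim=256, motion_dim=64, hidden=128):
        super().__init__()
        fused = appearance_dim + motion_dim          # concatenate the two cues
        self.mlp = nn.Sequential(
            nn.Linear(2 * fused, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),                    # scalar affinity per pair
        )

    def forward(self, feats_prev, feats_curr):
        # feats_prev: (N, fused), feats_curr: (M, fused) fused features for
        # detections in the previous and current frame.
        N, M = feats_prev.size(0), feats_curr.size(0)
        a = feats_prev.unsqueeze(1).expand(N, M, -1)
        b = feats_curr.unsqueeze(0).expand(N, M, -1)
        pairs = torch.cat([a, b], dim=-1)            # all N*M ordered pairs
        return self.mlp(pairs).squeeze(-1)           # (N, M) affinity matrix

# Usage: fuse per-detection appearance (e.g. from RGB) and motion (e.g. from
# 3D box dynamics) features by concatenation before scoring every pair.
appearance_prev, motion_prev = torch.randn(5, 256), torch.randn(5, 64)
appearance_curr, motion_curr = torch.randn(7, 256), torch.randn(7, 64)
fused_prev = torch.cat([appearance_prev, motion_prev], dim=-1)
fused_curr = torch.cat([appearance_curr, motion_curr], dim=-1)
affinity = PairwiseAffinity()(fused_prev, fused_curr)   # shape (5, 7)
```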
“…Chen et al.'s system is an autonomous system [130] that uses 3D information from 3D point clouds [131] and a relation convolution to model the correlation between pairs of objects. Wang et al. proposed unsupervised learning with a Siamese correlation filter network [132], which uses a multi-frame validation scheme and a cost-sensitive loss and achieves real-time speed.…”
Section: Affinity (mentioning, confidence: 99%)
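For readers unfamiliar with Siamese correlation trackers, the sketch below shows only the core cross-correlation step under assumed tensor shapes; it is not the network from [132], and the multi-frame validation scheme and cost-sensitive loss are omitted.

```python
# Minimal sketch (an assumption, not the cited architecture): the cross-correlation
# step at the heart of a Siamese tracker. A template embedding is slid over a
# search-region embedding; the peak of the response map locates the target.
import torch
import torch.nn.functional as F

template = torch.randn(1, 64, 6, 6)      # (B, C, h, w) embedded exemplar patch
search   = torch.randn(1, 64, 22, 22)    # (B, C, H, W) embedded search region

# Treat the template as a convolution kernel: the response peaks where it matches.
response = F.conv2d(search, template)    # (1, 1, 17, 17) correlation map
peak = response.flatten(1).argmax(dim=1)
row, col = divmod(peak.item(), response.shape[-1])   # peak location in the map
```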
“…Recent online 3D MOT methods often follow a tracking-by-detection pipeline with two steps: (1) given the trajectories associated up to the last frame and the detections in the current frame, an affinity matrix is computed, where each entry represents the similarity between a past trajectory and a current detection; (2) given the affinity matrix, the Hungarian algorithm [23] is used to obtain a locally optimal matching, which makes a hard assignment of each current detection to a past trajectory so that trajectories can be updated to the current frame. Though significant progress has been made recently on the first step, for example by improving affinity matrix estimation with graph neural networks [9], [24], [25] and multi-modal feature learning [1], [2], the second step has largely remained the same. In other words, modern 3D MOT methods typically generate a single set of trajectories via the Hungarian algorithm at inference time, which induces tracking errors that can be detrimental to prediction.…”
Section: Related Work (mentioning, confidence: 99%)
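Step (2) of the pipeline quoted above maps directly onto a standard linear assignment solver. The sketch below uses SciPy's `linear_sum_assignment` (a Hungarian-style solver); the gating threshold `MIN_AFFINITY` and the helper `match` are illustrative assumptions, not part of any cited method.

```python
# Minimal sketch of the hard-assignment step: given an affinity matrix between
# past trajectories (rows) and current detections (columns), obtain a one-to-one
# matching with a Hungarian-style solver, gating out low-affinity pairs.
import numpy as np
from scipy.optimize import linear_sum_assignment

MIN_AFFINITY = 0.5  # assumed gate: below this, leave the pair unmatched

def match(affinity: np.ndarray):
    # linear_sum_assignment minimizes total cost, so negate the affinities
    # to maximize total similarity instead.
    rows, cols = linear_sum_assignment(-affinity)
    matches = []
    unmatched_tracks = set(range(affinity.shape[0]))
    unmatched_dets = set(range(affinity.shape[1]))
    for r, c in zip(rows, cols):
        if affinity[r, c] >= MIN_AFFINITY:
            matches.append((r, c))
            unmatched_tracks.discard(r)
            unmatched_dets.discard(c)
    return matches, sorted(unmatched_tracks), sorted(unmatched_dets)

# Example: 3 past trajectories vs. 4 current detections.
aff = np.array([[0.9, 0.1, 0.2, 0.0],
                [0.2, 0.8, 0.1, 0.1],
                [0.0, 0.1, 0.3, 0.4]])
print(match(aff))   # trajectory 2 stays unmatched under the 0.5 gate
```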