2020
DOI: 10.1016/j.imavis.2020.103932

Joint detection and tracking in videos with identification features

Abstract: Recent work has shown that combining object detection and tracking on video data improves performance on both tasks, but these methods strictly require a high frame rate. This assumption is often violated in real-world applications, where models run on embedded devices, often at only a few frames per second. Videos at a low frame rate suffer from large object displacements. Here, re-identification features may help to match detections of objects that have moved far between frames,…
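To make the abstract's point about large displacements concrete, here is a minimal toy sketch, not the paper's method. It compares matching detections across two frames by box overlap (IoU) with matching by cosine similarity of appearance embeddings. The boxes, the three-dimensional embeddings, and the helper names (`iou`, `cosine`, `match`) are all hypothetical and chosen only for illustration.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm > 0 else 0.0

def match(prev, curr, score):
    """Greedy one-to-one matching: take the best-scoring pairs first."""
    pairs = sorted(
        ((score(p, c), i, j) for i, p in enumerate(prev) for j, c in enumerate(curr)),
        reverse=True,
    )
    used_prev, used_curr, result = set(), set(), {}
    for s, i, j in pairs:
        if s <= 0 or i in used_prev or j in used_curr:
            continue  # skip non-overlapping or already assigned detections
        result[i] = j
        used_prev.add(i)
        used_curr.add(j)
    return result

# Toy data (assumed): objects A and B in the previous frame.
prev = [
    {"box": (0, 0, 10, 10), "emb": (1.0, 0.0, 0.1)},    # A
    {"box": (50, 0, 60, 10), "emb": (0.0, 1.0, 0.1)},   # B
]
# Low frame rate: both objects moved far; A now overlaps B's old position.
curr = [
    {"box": (100, 0, 110, 10), "emb": (0.1, 0.9, 0.1)},  # B, displaced
    {"box": (45, 0, 55, 10), "emb": (0.9, 0.1, 0.1)},    # A, displaced
]

by_iou = match(prev, curr, lambda p, c: iou(p["box"], c["box"]))
by_emb = match(prev, curr, lambda p, c: cosine(p["emb"], c["emb"]))
print("IoU matching:      ", by_iou)
print("Embedding matching:", by_emb)
```

In this toy setup the overlap-based matcher links the old B to the new A and leaves the old A unmatched, while the embedding-based matcher recovers both identities. A real tracker would learn the embeddings and would likely use an optimal assignment rather than the greedy loop assumed here.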

Cited by 16 publications (6 citation statements)
References: 28 publications
“…An end‐to‐end joint neural network [46] addresses detection, tracking and re‐identification altogether, which detects objects, provides tracking associations across frames and estimates id‐features, to match objects across frames further apart. For a fairer comparison, Model2 [46], this version without identification features in the IoU estimation, is chosen as the comparison model.…”
Section: Methods (mentioning)
confidence: 99%
“…The tracking of multiple objects (MOT) is an active field of research in computer vision. Recently, two main approaches have been the focus of research: tracking by detection (Bergmann et al, 2019; Pang et al, 2020; Peng et al, 2020; Wang et al, 2020) and joint detection and tracking (Munjal et al, 2020; Feng et al, 2023; Wang et al, 2021). Joint detection and tracking methods detect and track objects within a single model, utilizing visual appearance to locate objects within images.…”
Section: Related Work (mentioning)
confidence: 99%
“…Various JoDT methods have been proposed in (Zhang et al 2021;Wang, Kitani, and Weng 2020;Peng et al 2020;Hu et al 2019;Shenoi et al 2020;Kim and Kim 2016;Kieritz, Hubner, and Arens 2018;Ke et al 2019;Munjal et al 2020). In the early phase, the idea of JoDT was adopted for 2D MOT.…”
Section: Joint Object Detection and Tracking (mentioning)
confidence: 99%
“…The appearance cues and motion context identified by the detector were used to perform tracking. The 2D MOT methods in this category include MPNTrack (Brasó and Leal-Taixé 2020), RNN tracker (Kieritz, Hubner, and Arens 2018), CDT (Kim and Kim 2016), PredNet (Munjal et al 2020), and Chained-Tracker (Peng et al 2020). The JoDT methods for 3D MOT include the mono3DT (Hu et al 2019) and JRMOT (Shenoi et al 2020).…”
Section: Introduction (mentioning)
confidence: 99%