2022
DOI: 10.1109/tpami.2020.3034435

A Bayesian Filter for Multi-View 3D Multi-Object Tracking With Occlusion Handling

Abstract: This paper proposes an online multi-camera multi-object tracker that only requires monocular detector training, independent of the multi-camera configurations, allowing seamless extension/deletion of cameras without retraining effort. The proposed algorithm has a linear complexity in the total number of detections across the cameras, and hence scales gracefully with the number of cameras. It operates in the 3D world frame, and provides 3D trajectory estimates of the objects. The key innovation is a high fideli…

citations
Cited by 46 publications
(17 citation statements)
references
References 63 publications
“…The pipeline to estimate the 3D location of pedestrians in multi-camera scenarios, in a generalizable manner, often employs 2D monocular pedestrian detectors [7,15,14] and later fuse their results based on multi-view properties [16,23,17]. In this scenario, one way to make 3D pedestrian detection robust to domain shift, therefore generalizable, is to use monocular person detectors that do not need retraining for a specific target domain [19,9,12,7].…”
Section: Related Work (mentioning)
confidence: 99%
“…In this scenario, one way to make 3D pedestrian detection robust to domain shift, therefore generalizable, is to use monocular person detectors that do not need retraining for a specific target domain [19,9,12,7]. Another advantage of monocular detectors re-use is to simplify setup requirements, easing cameras addition/removal/combination [17].…”
Section: Related Work (mentioning)
confidence: 99%
“…With partial occlusion, the deep neural network based classifiers are less robust compared to humans [6], and it worsens the performance of detectors [7]. Therefore, occlusion handling has been studied extensively such as in pedestrian detection [8] [9] [10] [11], object tracking [12] [13] [14], face detection [15] [16], stereo images [17], car detection [18] [19], semantic part detection [20] [21], etc. However due to a huge number of variations in object category and instances, occlusion handling in generic object detection from a single still image is much harder.…”
Section: Introduction (mentioning)
confidence: 99%