2020
DOI: 10.48550/arxiv.2003.11753
Preprint

Real-time 3D Deep Multi-Camera Tracking

Quanzeng You,
Hao Jiang

Abstract: Tracking a crowd in 3D using multiple RGB cameras is a challenging task. Most previous multi-camera tracking algorithms are designed for the offline setting and have high computational complexity. Robust real-time multi-camera 3D tracking is still an unsolved problem. In this work, we propose a novel end-to-end tracking pipeline, Deep Multi-Camera Tracking (DMCT), which achieves reliable real-time multi-camera people tracking. Our DMCT consists of 1) a fast and novel perspective-aware Deep GroundPoint Network, 2) a…

Cited by 4 publications (13 citation statements)
References 22 publications (31 reference statements)
“…Alternatively, MVDet [10] aggregates the multi-view people detection information by applying a feature perspective transform to place all ground heatmaps (and later locations) of pedestrians in the same coordinate space. Similarly, DMCT [21] proposes a perspective-aware network, which produces distorted detection blobs (related to the camera's perspective). This is followed by a fusion procedure for ground-plane occupancy heatmap estimation and the use of a Deep Glimpse Network for person detection.…”
Section: Related Work (mentioning)
confidence: 99%
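
The perspective transform mentioned in the statement above can be illustrated for a single point. This is a minimal sketch, assuming the ground is planar so that a 3x3 homography relates image pixels to ground-plane coordinates; it is not the feature-map warping used by MVDet or DMCT. The function name `image_to_ground` and the matrix `H` are hypothetical and chosen for illustration only.

```python
def image_to_ground(H, u, v):
    """Map pixel (u, v) to ground-plane coordinates with a 3x3 homography H.

    H is a hypothetical image-to-ground homography given as nested lists.
    In practice it would come from camera calibration.
    """
    # Multiply H by the homogeneous pixel coordinate (u, v, 1).
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity on the ground plane")
    # Divide by the homogeneous scale to get Cartesian ground coordinates.
    return (x / w, y / w)


# Usage with an illustrative (made-up) homography.
H = [[2.0, 0.0, 1.0],
     [0.0, 2.0, -1.0],
     [0.0, 0.0, 2.0]]
ground_xy = image_to_ground(H, 3.0, 4.0)
```

Mapping every camera's detections through its own homography places them in one shared ground coordinate space, which is what makes the later fusion step possible.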
“…• AH, which computes an average heatmap as described by You and Jiang [21]. It obtains each camera's heatmaps by considering a non-normalized Gaussian kernel in world ground plane coordinates centered at each ground point with a radius of 0.8m and σ = 10.1.…”
Section: Detection Performance Evaluation (mentioning)
confidence: 99%
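
The AH baseline described in the statement above can be sketched as follows. This is a rough illustration under stated assumptions, not the cited authors' implementation: the grid resolution `cell_size` is a hypothetical parameter, σ = 10.1 is assumed to be expressed in grid cells (the statement gives no unit), and overlapping blobs within one camera are combined with a maximum, which is also an assumption.

```python
import math


def camera_heatmap(ground_points, width, height, cell_size,
                   radius_m=0.8, sigma_cells=10.1):
    """Render one camera's ground-plane heatmap as a list of rows.

    ground_points: (x, y) positions in metres on the world ground plane.
    cell_size: metres per grid cell (assumed, not given in the source).
    sigma_cells: Gaussian sigma, assumed here to be in grid cells.
    """
    heat = [[0.0] * width for _ in range(height)]
    radius_cells = radius_m / cell_size
    for px, py in ground_points:
        cx, cy = px / cell_size, py / cell_size
        # Bounding box of the kernel support, clipped to the grid.
        x0 = max(0, int(math.floor(cx - radius_cells)))
        x1 = min(width - 1, int(math.ceil(cx + radius_cells)))
        y0 = max(0, int(math.floor(cy - radius_cells)))
        y1 = min(height - 1, int(math.ceil(cy + radius_cells)))
        for gy in range(y0, y1 + 1):
            for gx in range(x0, x1 + 1):
                d2 = (gx - cx) ** 2 + (gy - cy) ** 2
                if d2 <= radius_cells ** 2:
                    # Non-normalized kernel: the peak value is 1 at the centre.
                    value = math.exp(-d2 / (2.0 * sigma_cells ** 2))
                    heat[gy][gx] = max(heat[gy][gx], value)
    return heat


def average_heatmap(heatmaps):
    """Average per-camera heatmaps of identical size, cell by cell."""
    n = len(heatmaps)
    height, width = len(heatmaps[0]), len(heatmaps[0][0])
    return [[sum(h[y][x] for h in heatmaps) / n for x in range(width)]
            for y in range(height)]


# Usage: two cameras on an 80x80 grid with 2.5 cm cells (illustrative values).
cam_a = camera_heatmap([(1.0, 1.0)], 80, 80, 0.025)
cam_b = camera_heatmap([], 80, 80, 0.025)
fused = average_heatmap([cam_a, cam_b])
```

Because the kernel is not normalized, a person seen by every camera yields a fused peak near 1, while a person seen by only some cameras yields a proportionally lower peak.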
“…GMLP [31] is inspired by the probabilistic approach [13] and jointly uses CNNs and Conditional Random Fields to model explicitly an occupancy volume map given detections estimated from multiple cameras. More recently [46] (DMCT) propose deep learning to directly compute the occupancy volume by fusing feature maps extracted from CNNs at multi-camera views.…”
Section: Related Work (mentioning)
confidence: 99%
“…As a result, these errors will propagate throughout the tracking graph, affecting total performance. The centralized representation approach [46,48] on the other hand, is not plagued by such obstacles since each node in the tracking graph is an occupancy map (not a tracklet), which is estimated from all detections at each timeframe. Unfortunately, the cost of the data association step is increased due to a huge state space of variables and integrating advances from single-camera methods is more complicated.…”
Section: Introduction (mentioning)
confidence: 99%