2020
DOI: 10.1007/978-3-030-58571-6_1
Multiview Detection with Feature Perspective Transformation

Cited by 71 publications (48 citation statements) · References 40 publications
“…In this sense, Baqué et al. [1] present a combination of convolutional neural networks (CNNs) and conditional random fields (CRFs) to handle the pairwise matching ambiguities between the observed pedestrians. Alternatively, MVDet [10] aggregates the multi-view people detection information by applying a feature perspective transform to place all ground heatmaps (and later locations) of pedestrians in the same coordinate space. Similarly, DMCT [21] proposes a perspective-aware network, which produces distorted detection blobs (related to the camera's perspective).…”
Section: Related Work
confidence: 99%
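The feature perspective transform described above can be illustrated with a minimal PyTorch sketch (not the authors' released code): each camera's feature map is resampled onto a shared ground-plane grid through a ground-to-image homography, so every view lands in the same coordinate space. The grid size, channel count and identity homography below are placeholder assumptions.

import torch
import torch.nn.functional as F

def warp_to_ground(feat, H_img_from_ground, grid_hw):
    """Warp a (B, C, h, w) image-space feature map onto a ground-plane grid.

    H_img_from_ground: (3, 3) homography mapping ground-grid pixel
    coordinates to feature-map pixel coordinates (from calibration).
    """
    B, C, h, w = feat.shape
    gh, gw = grid_hw
    # Homogeneous coordinates of every ground-plane grid cell.
    ys, xs = torch.meshgrid(torch.arange(gh, dtype=torch.float32),
                            torch.arange(gw, dtype=torch.float32),
                            indexing="ij")
    ground = torch.stack([xs, ys, torch.ones_like(xs)], dim=-1).reshape(-1, 3)
    # Project ground cells into the camera's feature map.
    img = ground @ H_img_from_ground.T
    img = img[:, :2] / img[:, 2:3].clamp(min=1e-6)  # points behind the camera are not handled
    # Normalise coordinates to [-1, 1] as required by grid_sample.
    img[:, 0] = img[:, 0] / (w - 1) * 2 - 1
    img[:, 1] = img[:, 1] / (h - 1) * 2 - 1
    grid = img.reshape(1, gh, gw, 2).expand(B, -1, -1, -1)
    return F.grid_sample(feat, grid, align_corners=True)

# Placeholder example: warp one camera's 128-channel feature map onto a 120x360 ground grid.
feat_cam = torch.randn(1, 128, 90, 160)
H = torch.eye(3)  # placeholder; real homographies come from camera calibration
ground_feat = warp_to_ground(feat_cam, H, (120, 360))

Repeating this per camera and concatenating (or summing) the warped maps is one straightforward way to put all views into the same ground-plane coordinate space before detection.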
“…Similar to many other existing multi-camera 3D pedestrian detection methods in the literature [10,21,23,16,1], our proposed approach is restricted to the ground plane. Therefore, it cannot correctly estimate the 3D location of people who are not standing on the ground (e.g., jumping).…”
Section: Limitations
confidence: 99%
“…The methods mentioned above handle occlusion well through multi-view information, but the fusion of laser point clouds and the generation of virtual multi-views take too long. The data collected by real multi-camera setups is used to detect pedestrians [7], and the aggregation of multi-view information is completed by feature projection. There is still room for improvement in recall rate and speed.…”
Section: Related Work
confidence: 99%
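As a side note on the "feature projection" mentioned in the quote, the ground-to-image mapping it relies on follows from standard pinhole calibration: for points on the z = 0 ground plane, the 3x4 projection matrix P = K[R|t] reduces to a 3x3 homography built from its first, second and fourth columns. The intrinsics and extrinsics below are placeholder values, not those of any particular dataset or of [7].

import numpy as np

K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0,    0.0,   1.0]])   # intrinsics (placeholder)
R = np.eye(3)                           # rotation (placeholder)
t = np.array([[0.0], [0.0], [5.0]])     # translation (placeholder)

P = K @ np.hstack([R, t])               # 3x4 projection matrix
H_img_from_ground = P[:, [0, 1, 3]]     # drop the z column, valid for z = 0

ground_pt = np.array([2.0, 3.0, 1.0])   # (x, y, 1) on the ground plane
img = H_img_from_ground @ ground_pt
u, v = img[:2] / img[2]                 # pixel coordinates of the foot point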
“…However, none of them performs well in terms of accuracy. MVDet found that combining a large convolution kernel with a convolution layer that has a large receptive field improves both time and accuracy [7]. However, there is still plenty of room for improvement in accuracy and speed.…”
Section: Introduction
confidence: 99%
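The large-kernel, large-receptive-field aggregation the quote attributes to MVDet [7] can be sketched roughly as a small dilated-convolution head applied to the stacked, ground-projected per-camera features. The channel widths, dilation rates and layer count here are illustrative assumptions, not the paper's exact configuration.

import torch
import torch.nn as nn

num_cams, feat_ch = 7, 128
ground_head = nn.Sequential(
    # Dilated 3x3 convolutions enlarge the receptive field on the ground plane
    # without shrinking the feature map.
    nn.Conv2d(num_cams * feat_ch, 256, kernel_size=3, padding=2, dilation=2),
    nn.ReLU(inplace=True),
    nn.Conv2d(256, 128, kernel_size=3, padding=4, dilation=4),
    nn.ReLU(inplace=True),
    nn.Conv2d(128, 1, kernel_size=3, padding=1),  # per-cell occupancy logit
)

stacked = torch.randn(1, num_cams * feat_ch, 120, 360)  # projected per-camera features
heatmap = ground_head(stacked)                          # (1, 1, 120, 360) occupancy map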
“…Human-centred computer vision tasks and applications, e.g., pedestrian detection [1, 2, 3], 3D human pose estimation [4, 5, 6, 7, 8, 9] and body reconstruction [10], benefit from multiple-camera systems. Multiple cameras provide different views of the same moment from a set of angles, which expands coverage and mitigates the occlusion problem of single-camera systems.…”
Section: Introduction
confidence: 99%