2020
DOI: 10.1007/978-3-030-58571-6_1
Multiview Detection with Feature Perspective Transformation

Cited by 71 publications (48 citation statements) · References 40 publications
“…In this sense, Baqué et al. [1] present a combination of convolutional neural networks (CNNs) and conditional random fields (CRFs) to handle the pairwise matching ambiguities between the observed pedestrians. Alternatively, MVDet [10] aggregates the multi-view people detection information by applying a feature perspective transform to place all ground heatmaps (and later locations) of pedestrians in the same coordinate space. Similarly, DMCT [21] proposes a perspective-aware network, which produces distorted detection blobs (related to the camera's perspective).…”
Section: Related Work
confidence: 99%
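The feature perspective transform described above can be illustrated with a minimal PyTorch sketch (not the authors' released code): each camera's feature map is resampled onto a shared ground-plane grid through a ground-to-image homography, so every view lands in the same coordinate space. The grid size, channel count and identity homography below are placeholder assumptions.

import torch
import torch.nn.functional as F

def warp_to_ground(feat, H_img_from_ground, grid_hw):
    """Warp a (B, C, h, w) image-space feature map onto a ground-plane grid.

    H_img_from_ground: (3, 3) homography mapping ground-grid pixel
    coordinates to feature-map pixel coordinates (from calibration).
    """
    B, C, h, w = feat.shape
    gh, gw = grid_hw
    # Homogeneous coordinates of every ground-plane grid cell.
    ys, xs = torch.meshgrid(torch.arange(gh, dtype=torch.float32),
                            torch.arange(gw, dtype=torch.float32),
                            indexing="ij")
    ground = torch.stack([xs, ys, torch.ones_like(xs)], dim=-1).reshape(-1, 3)
    # Project ground cells into the camera's feature map.
    img = ground @ H_img_from_ground.T
    img = img[:, :2] / img[:, 2:3].clamp(min=1e-6)  # points behind the camera are not handled
    # Normalise coordinates to [-1, 1] as required by grid_sample.
    img[:, 0] = img[:, 0] / (w - 1) * 2 - 1
    img[:, 1] = img[:, 1] / (h - 1) * 2 - 1
    grid = img.reshape(1, gh, gw, 2).expand(B, -1, -1, -1)
    return F.grid_sample(feat, grid, align_corners=True)

# Placeholder example: warp one camera's 128-channel feature map onto a 120x360 ground grid.
feat_cam = torch.randn(1, 128, 90, 160)
H = torch.eye(3)  # placeholder; real homographies come from camera calibration
ground_feat = warp_to_ground(feat_cam, H, (120, 360))

Repeating this per camera and concatenating (or summing) the warped maps is one straightforward way to put all views into the same ground-plane coordinate space before detection.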
“…Similar to many other existing multi-camera 3D pedestrian detection methods in the literature [10,21,23,16,1], our proposed approach is restricted to the ground plane. Therefore, it cannot correctly estimate the 3D location of people who are not standing on the ground (e.g., jumping).…”
Section: Limitations
confidence: 99%
“…The methods mentioned above handle occlusion well through multi-view information, but the fusion of laser point clouds and the generation of virtual multi-views take too long. The data collected by real multi-camera setups is used to detect pedestrians [7], and the aggregation of multi-view information is completed by feature projection. There is still room for improvement in recall rate and speed.…”
Section: Related Work
confidence: 99%
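As a side note on the "feature projection" mentioned in the quote, the ground-to-image mapping it relies on follows from standard pinhole calibration: for points on the z = 0 ground plane, the 3x4 projection matrix P = K[R|t] reduces to a 3x3 homography built from its first, second and fourth columns. The intrinsics and extrinsics below are placeholder values, not those of any particular dataset or of [7].

import numpy as np

K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0,    0.0,   1.0]])   # intrinsics (placeholder)
R = np.eye(3)                           # rotation (placeholder)
t = np.array([[0.0], [0.0], [5.0]])     # translation (placeholder)

P = K @ np.hstack([R, t])               # 3x4 projection matrix
H_img_from_ground = P[:, [0, 1, 3]]     # drop the z column, valid for z = 0

ground_pt = np.array([2.0, 3.0, 1.0])   # (x, y, 1) on the ground plane
img = H_img_from_ground @ ground_pt
u, v = img[:2] / img[2]                 # pixel coordinates of the foot point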
“…However, none of them performs well in terms of accuracy. MVDet found that combining a large convolution kernel with a convolution layer that has a large receptive field improves both time and accuracy [7]. However, there is still plenty of room for improvement in accuracy and speed.…”
Section: Introduction
confidence: 99%
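The large-kernel, large-receptive-field aggregation the quote attributes to MVDet [7] can be sketched roughly as a small dilated-convolution head applied to the stacked, ground-projected per-camera features. The channel widths, dilation rates and layer count here are illustrative assumptions, not the paper's exact configuration.

import torch
import torch.nn as nn

num_cams, feat_ch = 7, 128
ground_head = nn.Sequential(
    # Dilated 3x3 convolutions enlarge the receptive field on the ground plane
    # without shrinking the feature map.
    nn.Conv2d(num_cams * feat_ch, 256, kernel_size=3, padding=2, dilation=2),
    nn.ReLU(inplace=True),
    nn.Conv2d(256, 128, kernel_size=3, padding=4, dilation=4),
    nn.ReLU(inplace=True),
    nn.Conv2d(128, 1, kernel_size=3, padding=1),  # per-cell occupancy logit
)

stacked = torch.randn(1, num_cams * feat_ch, 120, 360)  # projected per-camera features
heatmap = ground_head(stacked)                          # (1, 1, 120, 360) occupancy map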
“…Human-centred computer vision tasks and applications, e.g., pedestrian detection [1, 2, 3], 3D human pose estimation [4, 5, 6, 7, 8, 9] and body reconstruction [10], benefit from multiple-camera systems. Multiple cameras provide different views of the same moment from a set of angles, which expands coverage and mitigates the occlusion problem of single-camera systems.…”
Section: Introduction
confidence: 99%