2022
DOI: 10.3390/s22031082
Adopting the YOLOv4 Architecture for Low-Latency Multispectral Pedestrian Detection in Autonomous Driving

Abstract: Detecting pedestrians in autonomous driving is a safety-critical task, and the decision to avoid a person has to be made with minimal latency. Multispectral approaches that combine RGB and thermal images are researched extensively, as they make it possible to gain robustness under varying illumination and weather conditions. State-of-the-art solutions employing deep neural networks offer high accuracy of pedestrian detection. However, the literature is short of works that evaluate multispectral pedestrian de…

Cited by 44 publications (25 citation statements)
References 33 publications (50 reference statements)
“…MAF-YOLO (Xue et al., 2021) achieved a higher accuracy of 87.8%, but at 40 FPS, which is slower than BlendNet. In (Roszyk et al., 2022), YOLOv4 and a middle-fusion YOLOv4-tiny variant were applied to multispectral pedestrian detection. The tiny version achieved 55.7%, which is lower than ours (78.48%), although a very high speed of 410 FPS was reached thanks to TensorRT optimisation.…”
Section: Results With Public Dataset and Resource-Constrained Device
confidence: 99%
“…Therefore, we design two independent CNN branches to extract features from both modalities and fuse them in the middle convolution layers, as shown in Figure 4. For the fusion method, we employ the technique called Halfway Fusion, following previous studies [14,16,17], to combine features from both modalities. Specifically, the feature fusion process can be formulated simply, where i indicates the index of the layer in which the feature fusion occurs.…”
Section: Materials and Methods
confidence: 99%
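The Halfway Fusion scheme quoted above amounts to channel-wise concatenation of mid-level feature maps from the two branches. The sketch below is a hypothetical illustration of that idea (names, shapes, and the toy data are assumptions, not the cited papers' implementation):

```python
# Hedged sketch of Halfway Fusion: two independent branches extract
# features from RGB and thermal inputs, and their feature maps are
# concatenated along the channel axis at a middle layer i before the
# shared remainder of the network. Shapes and names are illustrative.

def fuse_halfway(rgb_feats, thermal_feats):
    """Channel-wise concatenation of per-modality feature maps.

    Each argument is a list of channels; each channel is an H x W
    list of lists. Spatial sizes must match for the fusion to be valid.
    """
    h, w = len(rgb_feats[0]), len(rgb_feats[0][0])
    assert all(len(c) == h and len(c[0]) == w for c in thermal_feats)
    return rgb_feats + thermal_feats  # C_rgb + C_thermal channels


# Toy example: two 2x2 channels per modality at fusion layer i.
rgb = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
thermal = [[[9, 8], [7, 6]], [[5, 4], [3, 2]]]
fused = fuse_halfway(rgb, thermal)
print(len(fused))  # 4 channels after fusion
```

In a real detector the concatenation is typically followed by a 1x1 convolution (Network-in-Network layer) to reduce the doubled channel count back to the backbone's width.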
“…J. Li et al. [14] designed four Faster R-CNN [15] based DNN architectures to combine RGB and thermal features at various stages, and empirically analyzed the results to determine the most effective fusion method. To achieve low latency for real-world applications, Roszyk et al. and Cao et al. [16,17] conducted similar experiments with a different architecture baseline, YOLOv4 [18]. According to the findings of these three studies, Halfway Fusion, which combines the RGB and thermal branches in the middle convolution layers, yields the best results regardless of the base architecture.…”
Section: Introduction
confidence: 99%
“…Specifically, there are two types of illumination-aware weighting designs, depending on whether the weighting is built on the features computed in the Faster R-CNN detector: built into the detector [43] or an independent illumination network [18]. Meanwhile, several recent works focus on real-time multispectral pedestrian detection based on one-stage frameworks such as YOLO and SSD [44][45][46]. Recently, Li et al. [46] proposed a method integrating both feature-level fusion and decision-level fusion to ensure reliable detection.…”
Section: Multispectral Pedestrian Detection
confidence: 99%
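As a rough illustration of the illumination-aware weighting idea mentioned in the statement above, per-detection confidences from the RGB and thermal streams can be blended with a scalar illumination weight. The function below is a hypothetical sketch under that assumption, not the cited architectures' actual design:

```python
def illumination_weighted_score(score_rgb, score_thermal, illumination):
    """Blend per-detection confidences from the two modalities.

    illumination in [0, 1]: 1.0 = well-lit scene (trust RGB more),
    0.0 = dark scene (trust thermal more). A real system would predict
    this weight with a small illumination sub-network rather than set
    it by hand.
    """
    assert 0.0 <= illumination <= 1.0
    return illumination * score_rgb + (1.0 - illumination) * score_thermal


# Daytime: the RGB detection dominates; nighttime: thermal dominates.
day = illumination_weighted_score(0.9, 0.6, illumination=0.8)
night = illumination_weighted_score(0.3, 0.85, illumination=0.1)
print(round(day, 2), round(night, 2))  # 0.84 0.8
```

This is decision-level fusion in its simplest form; the feature-level variant discussed above instead applies such weights to intermediate feature maps before the detection head.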