Object Detection in Complex Road Scenarios: Improved YOLOv4-Tiny Algorithm

Zhu, Da; Xu, Guanghui; Zhou, Jie; Di, Enbiao; Li, Mingcan

doi:10.1109/ictc51749.2021.9441643

Cited by 19 publications

(8 citation statements)

References 6 publications


“…A real-time object recognition system that can recognize multiple objects within an image frame. YOLO has evolved into new versions over time, e.g., YOLOv2, YOLOv3, and YOLOv4 [25]. YOLOv4 is an object detection algorithm that evolves the YOLOv3 model.…”

Section: Proposed Two-stage Deep-learning Based Design (mentioning)

confidence: 99%

“…Moreover, the performances for average accuracy and frames per second of YOLOv4 are increased compared to YOLOv3. YOLOv4-tiny [25,26] is a compressed version of YOLOv4. Based on YOLOv4, it is proposed to simplify the network structure, reduce parameters and enable development on embedded devices, and YOLOv4-tiny based model performs faster training and faster detection by comparison with YOLOv4.…”

Section: Proposed Two-stage Deep-learning Based Design (mentioning)

confidence: 99%


Two-Stage Deep Learning Technology Based Iris Recognition Methodology for Biometric Authorization

Hsiao, Chang, Fan. 2024. JAIT


In this study, the proposed iris recognition method uses the You Only Look Once (YOLO)-based deep learning algorithm with the procedure divided into two stages. After extraction of the iris and pupil from the images, the iris Region of Interest (ROI) is identified by the classifier. Iris localization, iris segmentation, and feature enhancement are three crucial processes when extracting the iris ROI, and they constitute the first stage. Iris localization is firstly discussed, and the three methods are proposed with the system performance analyzed from the perspective of both system safety and affordability. The main difference among these methods is their complexity. Iris segmentation is then introduced, and an experiment is conducted to evaluate system performance when images are preprocessed for inputs by different segmentation methods, including images with and without normalization. Normalization and its necessary or unnecessary role in identifying images with deep learning are then analyzed. Finally, an examination of how feature enhancement influences the results of the proposed method is outlined. For system safety analysis, the Equal Error Rate (EER) of the proposed design approaches near zero; for system affordability analysis, the accuracy of the proposed design can be up to 98%.


“…The role of Patch Merging is similar to the maximum pooling layer of CNN, but the maximum pooling used by CNN to achieve down sampling will discard some information, so using Patch Merging can increase the accuracy of the model. Another advantage of using Swin Transformer is that the core point of the algorithm uses the Swin Transformer Block, which consists of Window Multi-Head Self-Attention (W-MSA) [31][32][33] and Shifted-Window Multi-Head Self-Attention (SW-MSA) [34][35][36], as shown in Fig. 3.…”

Section: Backbone Selection (mentioning)

confidence: 99%

Fine Grained Feature Extraction Model of Riot-related Images Based on YOLOv5

Su, Yuan, Wang et al. 2023. Computer Systems Science and Engineering


With the rapid development of Internet technology, the type of information in the Internet is extremely complex, and a large number of riot contents containing bloody, violent and riotous components have appeared. These contents pose a great threat to the network ecology and national security. As a result, the importance of monitoring riotous Internet activity cannot be overstated. Convolutional Neural Network (CNN-based) target detection algorithm has great potential in identifying rioters, so this paper focused on the use of improved backbone and optimization function of You Only Look Once v5 (YOLOv5), and further optimization of hyperparameters using genetic algorithm to achieve fine-grained recognition of riot image content. First, the fine-grained features of riot-related images were identified, and then the dataset was constructed by manual annotation. Second, the training and testing work was carried out on the constructed dedicated dataset by supervised deep learning training. The research results have shown that the improved YOLOv5 network significantly improved the fine-grained feature extraction capability of riot-related images compared with the original YOLOv5 network structure, and the mean average precision (mAP) value was improved to 0.6128. Thus, it provided strong support for combating riot-related organizations and maintaining the online ecological environment.
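The citation statement for this entry contrasts Patch Merging with CNN max pooling. A minimal NumPy sketch of that contrast follows; it is an illustration only, not the citing authors' code. The function names `max_pool_2x2` and `patch_merge_2x2` and the toy input are assumptions, and the linear projection that Swin Transformer applies after the concatenation is left out.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling, stride 2, on an (H, W, C) array.

    Only the largest value in each 2x2 window survives, so three of
    every four activations per channel are discarded.
    """
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def patch_merge_2x2(x):
    """Patch-merging style downsampling: (H, W, C) -> (H/2, W/2, 4C).

    The four pixels of each 2x2 neighbourhood are stacked along the
    channel axis, so every input value is still present in the output.
    """
    top_left = x[0::2, 0::2, :]
    bottom_left = x[1::2, 0::2, :]
    top_right = x[0::2, 1::2, :]
    bottom_right = x[1::2, 1::2, :]
    return np.concatenate(
        [top_left, bottom_left, top_right, bottom_right], axis=-1
    )

# Hypothetical 4x4 feature map with 2 channels.
x = np.arange(4 * 4 * 2, dtype=np.float32).reshape(4, 4, 2)
pooled = max_pool_2x2(x)      # shape (2, 2, 2)
merged = patch_merge_2x2(x)   # shape (2, 2, 8)
```

Both outputs halve the spatial resolution, but `merged` holds as many values as `x` while `pooled` holds a quarter of them, which is the information loss the statement refers to.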


“…Hence, for the purpose of drone detection, a deep convolutional neural network-based model known as YOLO (You Only Look Once), essentially a state-of-the-art object detection model, is chosen and trained on a dataset of drone images. The parameters of the model have been tuned in such a way so as to better YOLOv3-tiny [2], YOLOv4 [5], YOLOv4-tiny [31], and we compare them on the basis of some performance metrics to choose the one best suited for our problem.…”

Section: Drone Detection (mentioning)

confidence: 99%

Lightweight Multi-Drone Detection and 3D-Localization via YOLO

Sharma, Nitik, Kothari. 2022. Preprint


In this work, we present and evaluate a method to perform real-time multiple drone detection and three-dimensional localization using state-of-the-art tiny-YOLOv4 object detection algorithm and stereo triangulation. Our computer vision approach eliminates the need for computationally expensive stereo matching algorithms, thereby significantly reducing the memory footprint and making it deployable on embedded systems. Our drone detection system is highly modular (with support for various detection algorithms) and capable of identifying multiple drones in a system, with real-time detection accuracy of up to 77% with an average FPS of 332 (on Nvidia Titan Xp). We also test the complete pipeline in AirSim environment, detecting drones at a maximum distance of 8 meters, with a mean error of 23% of the distance. We also release the source code for the project, with pre-trained models and the curated synthetic stereo dataset which can be found at github.com/aryanshar/swarm-detection
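The abstract above pairs a detector with stereo triangulation. As a rough sketch of how a 3D position might be recovered from one detection per camera, the snippet below applies the standard rectified-stereo relation Z = f·B/d to the bounding-box centres. It is not taken from the paper's released code: the function name `triangulate`, the use of box centres, and all camera parameters are assumptions made for illustration.

```python
def triangulate(center_left, center_right, focal_px, baseline_m, cx, cy):
    """Estimate a 3D point from matching box centres in a rectified stereo pair.

    center_left, center_right: (x, y) pixel coordinates of the detection centre.
    focal_px: focal length in pixels (assumed equal for both cameras).
    baseline_m: distance between the two camera centres in metres.
    cx, cy: principal point of the left camera in pixels.
    Returns (X, Y, Z) in metres in the left camera frame.
    """
    x_left, y_left = center_left
    x_right, _ = center_right
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity
    x = (x_left - cx) * z / focal_px
    y = (y_left - cy) * z / focal_px
    return x, y, z

# Hypothetical camera and detections, chosen only to make the arithmetic easy to follow.
position = triangulate(
    center_left=(700.0, 360.0),
    center_right=(650.0, 362.0),
    focal_px=800.0,
    baseline_m=0.5,
    cx=640.0,
    cy=360.0,
)
```

With these made-up numbers the disparity is 50 pixels, so the depth is 800 × 0.5 / 50 = 8 m. Because only one point per detection is triangulated, no dense stereo matching is required, which is in line with the motivation given in the abstract.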
