Improved YOLOX-X based UAV aerial photography object detection algorithm

Wang, Xin; He, Ning; Hong, Chen; Wang, Qi; Chen, Ming

doi:10.1016/j.imavis.2023.104697

Cited by 26 publications

(3 citation statements)

References 49 publications


“…The neck network structure mainly uses the Feature Pyramid Network (FPN) [33] and the Pyramid Attention Network (PAN) [34]. FPN adopts the top-down paths and lateral connections and fuses the underlying high-resolution features with the top-level semantic information.…”

Section: Yolov5s (mentioning)

confidence: 99%

CCDS-YOLO: Multi-Category Synthetic Aperture Radar Image Object Detection Model Based on YOLOv5s

Huang, Liu, Liu et al. 2023. Electronics


Synthetic Aperture Radar (SAR) is an active microwave sensor that has attracted widespread attention due to its ability to observe the ground around the clock. Research on multi-scale and multi-category target detection methods holds great significance in the fields of maritime resource management and wartime reconnaissance. However, complex scenes often influence SAR object detection, and the diversity of target scales also brings challenges to research. This paper proposes a multi-category SAR image object detection model, CCDS-YOLO, based on YOLOv5s, to address these issues. Embedding the Convolutional Block Attention Module (CBAM) in the feature extraction part of the backbone network strengthens the model’s ability to extract and fuse spatial information and channel information. The 1 × 1 convolution in the feature pyramid network and the first layer convolution of the detection head are replaced with the expanded convolution, Coordinate Convolution (CoordConv), forming a CRD-FPN module. This module more accurately perceives the spatial details of the feature map, enhancing the model’s ability to handle regression tasks compared to traditional convolution. In the detector segment, a decoupled head is utilized for feature extraction, offering optimal and effective feature information for the classification and regression branches separately. The traditional Non-Maximum Suppression (NMS) is substituted with the Soft Non-Maximum Suppression (Soft-NMS), successfully reducing the model’s duplicate detection rate for compact objects. Based on the experimental findings, the approach presented in this paper demonstrates excellent results in multi-category target recognition for SAR images. Empirical comparisons are conducted on the filtered MSAR dataset. Compared with YOLOv5s, the performance of CCDS-YOLO has been significantly improved. The mAP@0.5 value increases by 3.3% to 92.3%, the precision increases by 3.4%, and the mAP@0.5:0.95 increases by 6.7%. Furthermore, in comparison with other mainstream detection models, CCDS-YOLO stands out in overall performance and anti-interference ability.
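The abstract above mentions replacing NMS with Soft-NMS but does not say which variant is used. As a rough illustration only, the sketch below shows the commonly described Gaussian form, in which the scores of overlapping boxes are decayed instead of the boxes being discarded outright. The function names, the `sigma` default, and the `score_threshold` value are assumptions made for this example and are not taken from the CCDS-YOLO paper.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS sketch (sigma and threshold are illustrative).

    Returns a list of (box_index, rescored_confidence) in selection order.
    """
    remaining = [(s, i) for i, s in enumerate(scores) if s >= score_threshold]
    kept = []
    while remaining:
        # Select the highest-scoring box among those still under consideration.
        best = max(range(len(remaining)), key=lambda k: remaining[k][0])
        best_score, best_idx = remaining.pop(best)
        kept.append((best_idx, best_score))
        # Decay the scores of the other boxes according to their overlap
        # with the selected box, then drop any that fall below the threshold.
        decayed = []
        for s, i in remaining:
            overlap = iou(boxes[best_idx], boxes[i])
            s = s * math.exp(-(overlap ** 2) / sigma)
            if s >= score_threshold:
                decayed.append((s, i))
        remaining = decayed
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
result = soft_nms(boxes, scores)
```

In this toy input the second box overlaps the first heavily, so its score is reduced and it is ranked after the distant third box, whereas hard NMS with a typical IoU threshold would remove it entirely.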



“…Wu et al [20] proposed a multi-branch parallel network that utilizes multi-branch up-sampling and down-sampling to reduce information loss when the size of a feature map changes. Wang et al [21] added an ultra-lightweight subspace attention module (ULSAM) to a path aggregation network to highlight object features. Huang et al [22] proposed a feature-guided enhancement (FGE) module that designs two nonlinear operators to learn discriminant information.…”

Section: Introduction (mentioning)

confidence: 99%

MFEFNet: A Multi-Scale Feature Information Extraction and Fusion Network for Multi-Scale Object Detection in UAV Aerial Images

Zhou, Zhao, Wan et al. 2024. Drones


Unmanned aerial vehicles (UAVs) are now widely used in many fields. Due to the randomness of UAV flight height and shooting angle, UAV images usually have the following characteristics: many small objects, large changes in object scale, and complex background. Therefore, object detection in UAV aerial images is a very challenging task. To address the challenges posed by these characteristics, this paper proposes a novel UAV image object detection method based on global feature aggregation and context feature extraction named the multi-scale feature information extraction and fusion network (MFEFNet). Specifically, first of all, to extract the feature information of objects more effectively from complex backgrounds, we propose an efficient spatial information extraction (SIEM) module, which combines residual connection to build long-distance feature dependencies and effectively extracts the most useful feature information by building contextual feature relations around objects. Secondly, to improve the feature fusion efficiency and reduce the burden brought by redundant feature fusion networks, we propose a global aggregation progressive feature fusion network (GAFN). This network adopts a three-level adaptive feature fusion method, which can adaptively fuse multi-scale features according to the importance of different feature layers and reduce unnecessary intermediate redundant features by utilizing the adaptive feature fusion module (AFFM). Furthermore, we use the MPDIoU loss function as the bounding-box regression loss function, which not only enhances model robustness to noise but also simplifies the calculation process and improves the final detection efficiency. Finally, the proposed MFEFNet was tested on VisDrone and UAVDT datasets, and the mAP0.5 value increased by 2.7% and 2.2%, respectively.
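The abstract above names MPDIoU as the bounding-box regression loss without giving its form. The sketch below follows the definition as it is usually stated for MPDIoU: IoU minus the squared distances between the two boxes' top-left corners and bottom-right corners, each normalised by the squared diagonal of the input image. This is an assumption about the formulation, not a reproduction of the MFEFNet implementation, and the function name and argument layout are hypothetical.

```python
def mpdiou_loss(pred, target, img_w, img_h):
    """MPDIoU loss sketch for two boxes given as (x1, y1, x2, y2).

    img_w and img_h are the input image width and height, used here
    only to normalise the corner distances.
    """
    # Plain IoU term.
    ix1, iy1 = max(pred[0], target[0]), max(pred[1], target[1])
    ix2, iy2 = min(pred[2], target[2]), min(pred[3], target[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_t = (target[2] - target[0]) * (target[3] - target[1])
    union = area_p + area_t - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners.
    d1_sq = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    d2_sq = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2
    norm = img_w ** 2 + img_h ** 2

    mpdiou = iou - d1_sq / norm - d2_sq / norm
    return 1.0 - mpdiou


same = mpdiou_loss((10, 10, 50, 50), (10, 10, 50, 50), 640, 640)
near = mpdiou_loss((10, 10, 50, 50), (200, 200, 240, 240), 640, 640)
far = mpdiou_loss((10, 10, 50, 50), (500, 500, 540, 540), 640, 640)
```

Unlike a plain IoU loss, which is the same for any pair of non-overlapping boxes, the corner-distance terms make the loss grow as the predicted box moves further from the target, so `far` is larger than `near` here even though both pairs have zero overlap.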


“…Although data augmentation improved the detection of small objects to some extent, it merely increased the proportion of small objects in the data, lacking the integration and utilization of semantic information. Wang et al [15] introduced the Ultra-lightweight Subspace Attention Module (ULSAM) into the network structure, with an emphasis on target features and the attenuation of background features. However, this module primarily incorporated spatial information, neglecting channel information, and resulting in suboptimal small object detection performance, especially in densely occluded scenes.…”

Section: Introduction (mentioning)

confidence: 99%

Small Target-YOLOv5: Enhancing the Algorithm for Small Object Detection in Drone Aerial Imagery Based on YOLOv5

Zhou, Su et al. 2023. Sensors


Object detection in drone aerial imagery has been a consistent focal point of research. Aerial images present more intricate backgrounds, greater variation in object scale, and a higher occurrence of small objects compared to standard images. Consequently, conventional object detection algorithms are often unsuitable for direct application in drone scenarios. To address these challenges, this study proposes a drone object detection algorithm model based on YOLOv5, named SMT-YOLOv5 (Small Target-YOLOv5). The enhancement strategy involves improving the feature fusion network by incorporating detection layers and implementing a weighted bidirectional feature pyramid network. Additionally, the introduction of the Combine Attention and Receptive Fields Block (CARFB) receptive field feature extraction module and DyHead dynamic target detection head aims to broaden the receptive field, mitigate information loss, and enhance perceptual capabilities in spatial, scale, and task domains. Experimental validation on the VisDrone2021 dataset confirms a significant improvement in the target detection accuracy of SMT-YOLOv5. Each improvement strategy yields effective results, raising the average precision by 12.4 percentage points compared to the original method. Detection improvements for large, medium, and small targets increase by 6.9%, 9.5%, and 7.7%, respectively, compared to the original method. Similarly, applying the same improvement strategies to the low-complexity YOLOv8n results in SMT-YOLOv8n, which is comparable in complexity to SMT-YOLOv5s. The results indicate that, relative to SMT-YOLOv8n, SMT-YOLOv5s achieves a 2.5 percentage point increase in average precision. Furthermore, comparative experiments with other enhancement methods demonstrate the effectiveness of the improvement strategies.
