Classifying small objects remains challenging for current deep learning classification models, such as convolutional neural networks (CNNs) and vision transformers (ViTs). We believe that these algorithms are not designed specifically for small targets, so their feature extraction ability for small targets is insufficient. To improve the small-object classification capabilities of CNN-based and ViT-based models, two multidomain feature fusion (MDFF) frameworks, MDFF-ConvMixer and MDFF-ViT, are proposed to increase the amount of feature information derived from images. Compared with the base models, the added components are frequency-domain feature extraction and an MDFF process. In the frequency-domain feature extraction part, the input image is first transformed into the frequency domain via the discrete cosine transform (DCT), and a three-dimensional matrix containing the frequency-domain information is then obtained through channel splicing and reshaping. In the MDFF part, MDFF-ConvMixer concatenates the spatial-domain and frequency-domain features along the channel dimension, whereas MDFF-ViT fuses the two feature types with a cross-attention mechanism. On small-object classification tasks, both frameworks clearly improve the underlying classification algorithms. On the DOTA dataset and the CIFAR-10 dataset with two downsampling operations, MDFF-ConvMixer raises the accuracy of ConvMixer from 87.82% and 62.14% to 90.14% and 66.00%, respectively, and MDFF-ViT raises the accuracy of ViT from 79.22% and 36.2% to 88.15% and 59.23%, respectively.
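The frequency-domain feature extraction step (DCT, then channel splicing and reshaping into a three-dimensional matrix) can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the 8x8 block size, the orthonormal DCT-II normalization, and the exact rearrangement of coefficients into channels are assumptions.

```python
import numpy as np

def dct2_matrix(n):
    # Orthonormal DCT-II basis matrix of size n x n.
    k = np.arange(n)
    M = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    M[0] *= 1 / np.sqrt(n)
    M[1:] *= np.sqrt(2 / n)
    return M

def dct_frequency_features(img, block=8):
    """Convert an H x W x C image into a frequency-domain tensor.

    Each non-overlapping block x block patch of each channel is
    transformed with a 2-D DCT; the block**2 coefficients of every
    patch are then moved into the channel dimension, yielding an
    (H/block) x (W/block) x (C * block**2) matrix. The block size
    of 8 is an illustrative assumption.
    """
    H, W, C = img.shape
    assert H % block == 0 and W % block == 0
    M = dct2_matrix(block)
    h, w = H // block, W // block
    # Split the image into non-overlapping blocks: (h, w, C, block, block).
    patches = img.reshape(h, block, w, block, C).transpose(0, 2, 4, 1, 3)
    # 2-D DCT of each block, computed as M @ patch @ M.T.
    coeffs = np.einsum('ij,hwcjk,lk->hwcil', M, patches, M)
    # Channel splicing/reshaping: frequency coefficients become channels.
    return coeffs.reshape(h, w, C * block * block)
```

Because the DCT basis here is orthonormal, the transform preserves the total energy of the image, and the resulting tensor can be consumed by a standard CNN or tokenized for a transformer alongside the spatial features.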
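The cross-attention fusion used by MDFF-ViT can likewise be sketched in miniature: spatial-domain tokens supply the queries, and frequency-domain tokens supply the keys and values. The single attention head, the random projection matrices, and the residual connection below are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(spatial, freq, rng=None):
    """Fuse spatial-domain tokens (queries) with frequency-domain
    tokens (keys/values) via single-head cross-attention.

    spatial: (N, d) token matrix; freq: (M, d) token matrix.
    The random projections stand in for learned weights.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    d = spatial.shape[1]
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    Q, K, V = spatial @ Wq, freq @ Wk, freq @ Wv
    # Scaled dot-product attention over the frequency tokens.
    attn = softmax(Q @ K.T / np.sqrt(d))
    # Residual connection keeps the original spatial information.
    return spatial + attn @ V
```

The output has the same shape as the spatial tokens, so the fused representation can be fed to the remaining transformer blocks unchanged; in contrast, MDFF-ConvMixer's channel concatenation simply widens the feature tensor.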