“…On the contrary, deep learning significantly improves detection performance by automatically learning discriminative features. In recent years, due to the rapid development of deep learning technology, many excellent algorithms [3], [4], [5], [6] have emerged in the field of object detection.…”
Smart satellites and unmanned aerial vehicles (UAVs) are typically equipped with visible-light and infrared (IR) sensors. However, achieving real-time object detection with such multimodal data on these resource-limited devices is challenging. This paper proposes HyperYOLO, a real-time lightweight object detection framework for multimodal remote sensing images. First, we propose a lightweight multimodal fusion module named Channel and Spatial Exchange (CSE) to effectively extract complementary information from the two modalities. The CSE module consists of two stages: channel exchange and spatial exchange. Channel exchange achieves global fusion by learning global weights that exploit cross-channel correlations, while spatial exchange calibrates local fusion by exploiting spatial relationships to capture fine detail. Second, we propose an effective auxiliary branch based on a feature pyramid network for super-resolution (FPNSR) that strengthens the framework's response to small objects by learning high-quality feature representations. Moreover, we embed a coordinate attention mechanism to help the network precisely localize and attend to objects of interest. Experiments on the VEDAI remote sensing dataset show that HyperYOLO achieves 76.72% mAP50, surpassing the state-of-the-art SuperYOLO by 1.63%. Meanwhile, HyperYOLO requires about 1.34 million (28%) fewer parameters and 3.97 (22%) fewer GFLOPs than SuperYOLO. In addition, HyperYOLO's model file is only 7.3 MB once the auxiliary FPNSR branch is removed, making it easier to deploy on resource-constrained devices.
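The two-stage exchange fusion described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the threshold-based channel gating and the checkerboard spatial mask are assumptions chosen only to make the exchange mechanics concrete, and in practice the gates would be learned (e.g., from batch-norm scaling factors) rather than fixed.

```python
import numpy as np

def channel_exchange(x_rgb, x_ir, gate_rgb, gate_ir, thresh=0.02):
    """Global fusion: channels whose (assumed learned) gate weight falls
    below `thresh` are replaced by the other modality's channels."""
    # gate_*: shape (C,); x_*: shape (C, H, W)
    out_rgb = np.where((gate_rgb < thresh)[:, None, None], x_ir, x_rgb)
    out_ir = np.where((gate_ir < thresh)[:, None, None], x_rgb, x_ir)
    return out_rgb, out_ir

def spatial_exchange(x_rgb, x_ir, stride=2):
    """Local fusion: exchange features on a checkerboard of spatial
    positions so each modality sees the other's local detail."""
    _, H, W = x_rgb.shape
    mask = (np.add.outer(np.arange(H), np.arange(W)) % stride) == 0
    out_rgb = np.where(mask, x_ir, x_rgb)
    out_ir = np.where(mask, x_rgb, x_ir)
    return out_rgb, out_ir
```

Under this sketch, channel exchange mixes whole feature maps across modalities (global, channel-wise), while spatial exchange interleaves individual pixel locations (local, position-wise), which matches the global/local division of labor the abstract attributes to the two CSE stages.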