Infrared Small-Dim Target Detection with Transformer under Complex Backgrounds

Liu, Fangcen; Gao, Chenqiang; Chen, Fang; Meng, Deyu; Zuo, Wangmeng; Gao, Xinbo

doi:10.48550/arxiv.2109.14379

Cited by 8 publications

(11 citation statements)

References 53 publications

“…We select some traditional methods: Top-Hat [37], LCM [3], WLDM [21], NARM [38], PSTNN [39], IPI [14], RIPT [8], NIPPS [9] and several open source deep learning SOTA methods including MDvsFA [], ACM [10], ALC [11], AGCP [41] and Transformer [19] which for comparison. The results are shown in Table 2, deep learning methods basically perform better than traditional ones due to their great power of feature extraction and generalization.…”

Section: Quantitative Results (mentioning)

confidence: 99%

“…Dai et al [10] proposed an asymmetric contextual modulation to help network performance well and introduced the first public ISOS dataset SIRST in real scenes, Dai et al [11] further applied a handcraft dilated local contrast measure into network. Liu et al [19] firstly introduced multi-head self-attention into ISOS tasks and got a good result. Zhang et al [41] proposed AGPCNet with attention-guided context block and context pyramid module.…”

Section: ISOS (mentioning)

confidence: 99%

“…Though existing deep learning methods have got great results, they mostly focus on the feature fusion. Liu et al [19] first introduced transformer block into ISOS task and got great results. Zhang et al introduced attention-guided context module to help AGPCNet [41] focus on small object.…”

Section: Introduction (mentioning)

confidence: 99%

Local Contrast and Global Contextual Information Make Infrared Small Object Salient Again

Wang, Wang, Pan

2023

Preprint

Infrared small object detection (ISOS) aims to segment small objects only covered with several pixels from clutter background in infrared images. It's of great challenge due to: 1) small objects lack of sufficient intensity, shape and texture information; 2) small objects are easily lost in the process where detection models, say deep neural networks, obtain high-level semantic features and image-level receptive fields through successive downsampling. This paper proposes a reliable detection model for ISOS, dubbed UCFNet, which can handle well the two issues. It builds upon central difference convolution (CDC) and fast Fourier convolution (FFC). On one hand, CDC can effectively guide the network to learn the contrast information between small objects and the background, as the contrast information is very essential in human visual system dealing with the ISOS task. On the other hand, FFC can gain image-level receptive fields and extract global information while preventing small objects from being overwhelmed. Experiments on several public datasets demonstrate that our method significantly outperforms the state-of-the-art ISOS models, and can provide useful guidelines for designing better ISOS deep models.
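
To make the contrast idea behind CDC concrete, here is a minimal pure-Python sketch of a single-channel central difference convolution. It assumes the commonly used formulation, in which a vanilla convolution response is blended with a term that subtracts the centre pixel, weighted by a factor `theta`; the function name, the `theta` default and the valid-only border handling are illustrative assumptions, not details taken from the UCFNet paper.

```python
def central_difference_conv(image, kernel, theta=0.7):
    """Single-channel central difference convolution over the valid region.

    Assumed formulation (not confirmed by the abstract above):
        y(p0) = sum_n w(pn) * x(p0 + pn)  -  theta * x(p0) * sum_n w(pn)
    theta = 0 gives a plain convolution (cross-correlation form);
    theta = 1 responds only to differences from the centre pixel.
    `image` and `kernel` are nested lists; the kernel must have odd sides.
    """
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    ph, pw = kh // 2, kw // 2
    kernel_sum = sum(sum(row) for row in kernel)

    out = []
    for r in range(ph, h - ph):
        out_row = []
        for c in range(pw, w - pw):
            # Vanilla term: weighted sum over the neighbourhood of (r, c).
            vanilla = sum(
                kernel[i][j] * image[r - ph + i][c - pw + j]
                for i in range(kh)
                for j in range(kw)
            )
            # Central difference term: remove the centre intensity.
            out_row.append(vanilla - theta * image[r][c] * kernel_sum)
        out.append(out_row)
    return out
```

With `theta=1` a flat patch produces a zero response, while a bright pixel on a dark surround produces a large-magnitude response, which is the local-contrast behaviour the abstract attributes to CDC.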

“…MAResU-Net [42] add the self-attention module to CNN for remote sensing image segmentation. After obtaining image features from CNN, Liu et al adopt the self-attention mechanism to learn the interaction information of image features in a larger range [43]. Unlike it, our network extracts features by a pure transformer structure and does not utilize the convolutional backbone network.…”

Section: Transformer For Computer Vision (mentioning)

confidence: 99%

IRSTFormer: A Hierarchical Vision Transformer for Infrared Small Target Detection

Chen, Tan

2022

Remote Sensing

Infrared small target detection occupies an important position in the infrared search and track system. The most common size of infrared images has developed to 640×512. The field-of-view (FOV) also increases significantly. As the result, there is more interference that hinders the detection of small targets in the image. However, the traditional model-driven methods do not have the capability of feature learning, resulting in poor adaptability to various scenes. Owing to the locality of convolution kernels, recent convolutional neural networks (CNN) cannot model the long-range dependency in the image to suppress false alarms. In this paper, we propose a hierarchical vision transformer-based method for infrared small target detection in larger size and FOV images of 640×512. Specifically, we design a hierarchical overlapped small patch transformer (HOSPT), instead of the CNN, to encode multi-scale features from the single-frame image. For the decoder, a top-down feature aggregation module (TFAM) is adopted to fuse features from adjacent scales. Furthermore, after analyzing existing loss functions, a simple yet effective combination is exploited to optimize the network convergence. Compared to other state-of-the-art methods, the normalized intersection-over-union (nIoU) on our IRST640 dataset and public SIRST dataset reaches 0.856 and 0.758. The detailed ablation experiments are conducted to validate the effectiveness and reasonability of each component in the method.
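
The nIoU figure quoted in the abstract can be illustrated with a short sketch. It assumes the per-image definition that is common in this literature: compute IoU separately for each image and average the results, so that images with tiny targets weigh as much as images with larger ones. The function names and the convention for two empty masks are assumptions made for illustration.

```python
def iou(pred, target):
    """IoU between two binary masks given as nested lists of 0/1."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def niou(preds, targets):
    """Normalized IoU: the mean of per-image IoU scores."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)
```

For example, one prediction that covers the target plus one extra pixel (IoU 0.5) and one exact prediction (IoU 1.0) average to an nIoU of 0.75.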

“…4 (b) that more than half of the targets contains about 20 pixels. Usually, small targets (e.g., aircraft, missiles) move rapidly in complex and variable clutter, making infrared images have a very low signal-to-clutter ratio (SCR) [61]. SCR [48], [62] is used to measure the target intensity and background intensity.…”

Section: Infrared Dim Small Target Datasets (mentioning)

confidence: 99%
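
As a rough illustration of the SCR measure mentioned in the statement above, the sketch below uses one widely used definition, SCR = |mu_t - mu_b| / sigma_b, where mu_t is the mean target intensity and mu_b and sigma_b are the mean and standard deviation of a surrounding background ring. The citing paper may define SCR differently; the function name, the box format and the `margin` parameter are assumptions.

```python
from statistics import mean, pstdev


def signal_to_clutter_ratio(image, target_box, margin=2):
    """SCR = |mean(target) - mean(background)| / std(background).

    image:      nested list of pixel rows.
    target_box: (top, left, bottom, right), with bottom/right exclusive.
    margin:     width in pixels of the background ring around the box
                (an illustrative choice, clipped at the image border).
    """
    top, left, bottom, right = target_box
    height, width = len(image), len(image[0])

    target, background = [], []
    for r in range(max(0, top - margin), min(height, bottom + margin)):
        for c in range(max(0, left - margin), min(width, right + margin)):
            inside = top <= r < bottom and left <= c < right
            (target if inside else background).append(image[r][c])

    sigma_b = pstdev(background)
    if sigma_b == 0:
        # Perfectly flat background: the ratio is unbounded.
        return float("inf")
    return abs(mean(target) - mean(background)) / sigma_b
```

A low value means the target barely stands out from the local clutter, which is the situation the statement describes for fast-moving small targets.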