2023
DOI: 10.3390/rs15092395
DCAT: Dual Cross-Attention-Based Transformer for Change Detection

Abstract: Several transformer-based methods for change detection (CD) in remote sensing images have been proposed, with Siamese-based methods showing promising results due to their two-stream feature extraction structure. However, these methods ignore the potential of the cross-attention mechanism to improve change feature discrimination and thus, may limit the final performance. Additionally, using either high-frequency-like fast change or low-frequency-like slow change alone may not effectively represent complex bi-te…

Cited by 10 publications (4 citation statements)
References 66 publications
“…The structure of this network is complicated, and its computational efficiency is low. Zhou et al. [31] introduced a dual cross-attention transformer (DCAT) network. This network is designed to extract both low-frequency and high-frequency information from input images through the computation of two distinct types of cross-attention features.…”
Section: Related Work
Confidence: 99%
“…Immediately after, we use the DTT decoder to reweight the original features based on the generated tokens to obtain the refined features considering the dual-temporal contextual relationships. While some works, such as [21][22][23], employ transformers based on cross-attention for change detection, their proposed cross-attention merely involves the straightforward calculation of attention matrices using the query (Q) from another temporal phase and the key (K) from the current temporal phase. This approach fails to adequately model the non-local structural relationships depicted in Figure 1.…”
Section: Introduction
Confidence: 99%
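The cross-attention pattern this statement describes — the query (Q) taken from one temporal phase and the key (K) and value from the other — can be illustrated with a minimal NumPy sketch. This is a hypothetical single-head example with identity Q/K/V projections, not the cited papers' actual implementation; a real model would use learned projection matrices and multi-head attention.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_cross_attention(feat_q, feat_kv):
    """Single-head cross-attention between bi-temporal token sequences.

    Q comes from one temporal phase, K and V from the other, so each
    token of the first image attends over the tokens of the second.
    feat_q, feat_kv: (num_tokens, dim) flattened feature maps.
    """
    d = feat_q.shape[-1]
    scores = feat_q @ feat_kv.T / np.sqrt(d)  # (n_q, n_kv) cross-phase affinities
    attn = softmax(scores, axis=-1)           # each row sums to 1
    return attn @ feat_kv                     # phase-1 tokens re-expressed via phase-2 content

rng = np.random.default_rng(0)
tokens_t1 = rng.standard_normal((16, 32))  # 16 tokens, 32-dim features, time 1
tokens_t2 = rng.standard_normal((16, 32))  # time 2
fused = temporal_cross_attention(tokens_t1, tokens_t2)
print(fused.shape)  # (16, 32)
```

The attention matrix here encodes only pairwise Q·K affinities between the two phases, which is the "straightforward calculation" the quoted passage argues is insufficient for modeling non-local structural relationships.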
“…For instance, simple skip connections in encoder and decoder features can lead to semantic gaps. Inspired by achievements in the field of medical image segmentation [27][28][29], this paper introduces a Dual Cross-Attention module (DCA) based on the UNET architecture, incorporating Channel Cross-Attention (CCA) and Spatial Cross-Attention (SCA) mechanisms. This DCA module adaptively captures channel and spatial dependencies between multi-scale encoder features in sequence to address the semantic gaps between encoder and decoder features in the UNET architecture.…”
Section: Introduction
Confidence: 99%
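The distinction this statement draws between Channel Cross-Attention (CCA) and Spatial Cross-Attention (SCA) comes down to what the attention tokens are: channels (a C×C attention map) versus spatial positions (an N×N map, with N = H·W). The sketch below, again a hypothetical NumPy illustration with identity projections rather than the cited module's implementation, shows both variants applied between encoder and decoder features.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def channel_cross_attention(enc_feat, dec_feat):
    """Tokens are channels: attention map is (C, C).

    enc_feat, dec_feat: (C, N) with N = H*W flattened positions.
    Decoder channels query encoder channels to bridge the semantic gap.
    """
    C, N = enc_feat.shape
    scores = dec_feat @ enc_feat.T / np.sqrt(N)   # (C, C) channel affinities
    return softmax(scores, axis=-1) @ enc_feat    # (C, N) re-weighted channels

def spatial_cross_attention(enc_feat, dec_feat):
    """Tokens are spatial positions: attention map is (N, N)."""
    C, N = enc_feat.shape
    Q, K, V = dec_feat.T, enc_feat.T, enc_feat.T  # (N, C) each
    scores = Q @ K.T / np.sqrt(C)                 # (N, N) position affinities
    return (softmax(scores, axis=-1) @ V).T       # back to (C, N)

rng = np.random.default_rng(1)
enc = rng.standard_normal((8, 64))  # 8 channels, 8x8 spatial grid flattened
dec = rng.standard_normal((8, 64))
cca_out = channel_cross_attention(enc, dec)
sca_out = spatial_cross_attention(enc, dec)
print(cca_out.shape, sca_out.shape)  # (8, 64) (8, 64)
```

Applying the two in sequence, as the passage describes, lets the module capture channel dependencies first and spatial dependencies second before the features are passed to the decoder.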