2022
DOI: 10.1109/lgrs.2022.3165885

MSTDSNet-CD: Multiscale Swin Transformer and Deeply Supervised Network for Change Detection of the Fast-Growing Urban Regions

Cited by 37 publications (19 citation statements)
References 13 publications
Citation types: 0 supporting, 7 mentioning, 0 contrasting

“…Supervised learning methods for detecting land cover change require a large amount of labeled data and are widely utilized in the field of LCCD. Various algorithms have been proposed by researchers to address different challenges in this domain, and these methods are primarily categorized into two types: pixel-based and object-based approaches [64][65][66][67][68][69]. This section reviews and discusses in depth the existing methods of these two classes.…”
Section: Supervised Learning Methods (mentioning)
Confidence: 99%
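
The pixel-based branch of the taxonomy above can be made concrete with a small differencing baseline: subtract the co-registered images pixel by pixel and threshold the difference magnitude. The sketch below is a generic illustration, not a method from the cited references; the Otsu helper, function names, and array shapes are assumptions.

```python
# Minimal pixel-based change detection sketch: image differencing plus a
# global Otsu threshold. Names and shapes are illustrative assumptions.
import numpy as np

def otsu_threshold(values: np.ndarray, bins: int = 256) -> float:
    """Return the threshold that maximizes between-class variance."""
    hist, edges = np.histogram(values, bins=bins)
    hist = hist.astype(float) / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(hist)                        # class-0 (below) weight
    w1 = 1.0 - w0                               # class-1 (above) weight
    mu = np.cumsum(hist * centers)              # cumulative class mean
    mu_t = mu[-1]                               # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * w0 - mu) ** 2 / (w0 * w1)
    return float(centers[np.nanargmax(sigma_b)])

def pixel_based_cd(img_t1: np.ndarray, img_t2: np.ndarray) -> np.ndarray:
    """Binary change map from the magnitude of the per-pixel difference."""
    diff = np.linalg.norm(img_t2.astype(float) - img_t1.astype(float), axis=-1)
    return diff > otsu_threshold(diff)

# Usage on two co-registered 3-band images of shape (H, W, 3):
t1 = np.random.rand(64, 64, 3)
t2 = t1.copy()
t2[20:40, 20:40] += 0.5                         # synthetic "changed" block
change_map = pixel_based_cd(t1, t2)             # boolean (64, 64) map
```

Object-based approaches instead segment the scene into regions first and classify region-level statistics, which suppresses the salt-and-pepper noise this per-pixel rule produces.
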
“…In order to extract the deep features of dual-phase images and their high-frequency differences, this paper designs a three-branch network framework based on DeepLabv3 and replaces the CNN backbone with the Swin Transformer structure [8], which has a stronger, global-receptive-field feature extraction capability. The Transformer structure originates from the field of natural language processing.…”
Section: Depth Feature Extraction Module (mentioning)
Confidence: 99%
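
As a rough illustration of the three-branch layout described in this statement, the sketch below feeds the two temporal images and their absolute difference through one weight-sharing backbone. The `ThreeBranchExtractor` name and the placeholder convolutional stem are assumptions; the cited design uses a Swin Transformer backbone inside a DeepLabv3-style framework, which is not reproduced here.

```python
# Hedged sketch of a three-branch, weight-sharing feature extractor:
# one branch per temporal image plus one for their difference.
import torch
import torch.nn as nn

class ThreeBranchExtractor(nn.Module):
    def __init__(self, backbone: nn.Module):
        super().__init__()
        self.backbone = backbone  # shared weights across all three branches

    def forward(self, t1: torch.Tensor, t2: torch.Tensor):
        f1 = self.backbone(t1)                  # pre-change features
        f2 = self.backbone(t2)                  # post-change features
        fd = self.backbone(torch.abs(t2 - t1))  # high-frequency difference
        return f1, f2, fd

# Placeholder CNN stem standing in for the Swin Transformer backbone.
backbone = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
model = ThreeBranchExtractor(backbone)
f1, f2, fd = model(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64))
```
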
“…Inspired by this, ViT [16] pioneered the introduction of the transformer architecture to large-scale image recognition with great success. Researchers have progressively applied it to change detection tasks [17][18][19][20]. Self-attention is the core component of the transformer architecture, which explicitly models one-dimensional sequence relations.…”
Section: Introduction (mentioning)
Confidence: 99%
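
The "one-dimensional sequence relations" mentioned in this statement refer to attention computed over a flattened token sequence, as in ViT. Below is a minimal single-head self-attention sketch; names and dimensions are illustrative assumptions.

```python
# Minimal single-head self-attention over a 1D token sequence,
# e.g. flattened image patches as in ViT-style models.
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)    # joint Q, K, V projection
        self.scale = dim ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        attn = torch.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        return attn @ v                        # weighted sum over all tokens

tokens = torch.rand(1, 196, 64)               # 14x14 patches, 64-dim each
out = SelfAttention(64)(tokens)               # same shape as the input
```
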
“…However, these methods inadequately explore the attention mechanism between bi-temporal images in the CD task. The self-attention mechanism of [17][18][19][20] only models the non-local structural relations within a single temporal phase (Figure 1a) and indiscriminately weights feature combinations in changed and unchanged regions in the same way, while ignoring the non-local structural relationships between the dual-temporal images (Figure 1b,c). Figure 1 illustrates the non-local structural relationships within the images, with the first row representing the “pre-change” image and the second row representing the “post-change” image.…”
Section: Introduction (mentioning)
Confidence: 99%
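
The limitation described here, attention confined to a single temporal phase, is commonly addressed with cross-attention, where queries from one date attend to keys and values from the other. The sketch below is a generic cross-attention module under assumed names and dimensions, not the mechanism of MSTDSNet-CD or any cited paper.

```python
# Hedged sketch of cross-attention between bi-temporal token sequences:
# every output position mixes information across the two acquisition dates.
import torch
import torch.nn as nn

class BiTemporalCrossAttention(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)          # queries from the pre-change tokens
        self.kv = nn.Linear(dim, 2 * dim)     # keys/values from the post-change tokens
        self.scale = dim ** -0.5

    def forward(self, f1: torch.Tensor, f2: torch.Tensor) -> torch.Tensor:
        # f1, f2: (batch, tokens, dim) features of the two dates
        q = self.q(f1)
        k, v = self.kv(f2).chunk(2, dim=-1)
        attn = torch.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        return attn @ v                        # pre-change queries, post-change context

f1 = torch.rand(1, 196, 64)                    # pre-change tokens
f2 = torch.rand(1, 196, 64)                    # post-change tokens
fused = BiTemporalCrossAttention(64)(f1, f2)
```
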