2021
DOI: 10.1109/jstars.2021.3109237
Scale-Robust Deep-Supervision Network for Mapping Building Footprints From High-Resolution Remote Sensing Images

Abstract: Building footprint information is one of the key factors for sustainable urban planning and environmental monitoring. Mapping building footprints from remote sensing images is an important and challenging task in the earth observation field. Over the years, convolutional neural networks have shown outstanding improvements in the building extraction field due to their ability to automatically extract hierarchical features and make building predictions. However, as buildings vary in size, scene…

Cited by 17 publications (6 citation statements)
References 40 publications
“…This is because the proposed multi-scale region attention mechanism and region consistency supervision strategy can better enhance the network's ability to extract semantic features from remote sensing images; by constraining and supervising the building regions and boundaries in the remote sensing images, the network pays closer attention to the local regions and contours of buildings, which improves the segmentation performance of the proposed network. In Table VI, rows 1 to 10 give the semantic segmentation results of FCN [40], SegNet [41], DeeplabV3 [42], PSPNet [43], Unet [39], Res-Unet [44], HR-Net [45], Chen et al [46], BRRNet [20] and DS-Net [49], respectively. The experimental comparison shows that, relative to FCN, SegNet, DeeplabV3, PSPNet, Unet, Res-Unet, HR-Net, Chen et al, BRRNet and DS-Net, the Precision of the proposed algorithm improves by 6.37%, 17.87%, 5.49%, 4.17%, 3.57%, 6.96%, 4.42%, 0.59% and -3.84%, respectively; the Recall improves by 7.11%, 4.78%, 4.84%, 5.29%, 3.13%, 4.95%, 5.25%, 1.38% and 5.29%, respectively; the IoU improves by 9.53%, 16.7%, 7.39%, 6.73%, 4.79%, 8.56%, 6.88%, 1.28%, 0.31% and 0.98%, respectively; and the F1-Score improves by 6.73%, 12.22%, 5.18%, 4.69%, 3.35%, 6.02%, 4.82%, 0.97%, 0.33% and 0.78%, respectively.…”
Section: F Massachusetts Dataset Experimental Results and Analysis
confidence: 99%
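The citation statements above report Precision, Recall, IoU, and F1-Score deltas. For reference, these four metrics are standard functions of the per-class confusion-matrix counts; the sketch below is a generic illustration (the counts are hypothetical, not taken from the paper), showing how each metric is derived from true positives (tp), false positives (fp), and false negatives (fn).

```python
def segmentation_metrics(tp: int, fp: int, fn: int):
    """Standard binary segmentation metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)          # fraction of predicted building pixels that are correct
    recall = tp / (tp + fn)             # fraction of true building pixels that are recovered
    iou = tp / (tp + fp + fn)           # intersection over union of prediction and ground truth
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean; equals 2*tp/(2*tp+fp+fn)
    return precision, recall, iou, f1

# Illustrative counts only (hypothetical, not from the paper):
p, r, iou, f1 = segmentation_metrics(tp=900, fp=50, fn=100)
```

Note that F1 always lies between IoU and 1, and the two rank methods identically on a single class, which is why papers often report both alongside Precision and Recall.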
“…On the Aerial Imagery Dataset, the proposed ReA-Net achieves a Precision of 95.61%, a Recall of 95.68%, and an IoU of 91.6%, with 235.16G MACs and 22.14M Params. Rows 1 to 12 give the semantic segmentation results of the comparison algorithms FCN [40], SegNet [41], DeeplabV3 [42], PSPNet [43], Unet [39], Res-Unet [44], HR-Net [45], Deng et al [25], DR-Net [21], Chen et al [46], RSR-Net [48] and DS-Net [49], respectively. Compared with FCN, SegNet, DeeplabV3, PSPNet, Unet, Res-Unet, HR-Net, Deng et al, DR-Net, Chen et al, RSR-Net and DS-Net, the experimental results show that the Precision of ReA-Net improves by 4.86%, 3.13%, 2.1%, 1.15%, 1.26%, 1.13%, 2.94%, 0.64%, 1.31%, 2.36%, 2.39% and 0.76%, respectively; the Recall improves by 5.68%, 4.13%, 2.62%, 2.13%, 2.02%, 2.84%, 2.02%, 0.87%, 1.38%, 0.12%, 3.43% and 0.62%, respectively; the IoU improves by 9.14%, 6.4%, 4.18%, 2.92%, 3.34%, 3.55%, 5.09%, 1.3%, 3.3%, 2.21%, 3.28% and 1.20%, respectively; and the F1-Score improves by 5.25%, 3.64%, 2.44%, 1.64%, 1.87%, 1.99%, 2.48%, 0.73%, 1.84%, 1.24%, 2.91% and 0.68%, respectively.…”
Section: Visualization
confidence: 99%
“…The building footprints are either obtained from the satellite images or, if available, from map data such as OSM. This paper focuses on the classification and post-processing part, as there are various other studies focusing on the building extraction task [10,11,12].…”
Section: Methods
confidence: 99%
“…In order to leverage large-scale contextual information and extract critical cues for identifying building pixels against complex backgrounds and under occlusion, researchers have proposed methods that capture local and long-range spatial dependencies among ground entities in the aerial scene [55], [56]. Several researchers also use transformers [60], attention modules [12], [61]- [63], and multiscale information [8], [43], [45], [46], [64]- [66] for this purpose. Recently, multiview satellite images [67], [68] have also been used to perform semantic segmentation of points on the ground.…”
Section: Related Work
confidence: 99%