2024
DOI: 10.3390/s24082393

MAD-UNet: A Multi-Region UAV Remote Sensing Network for Rural Building Extraction

Hang Xue,
Ke Liu,
Yumeng Wang
et al.

Abstract: For the development of an idyllic rural landscape, an accurate survey of rural buildings is essential. The extraction of rural structures from unmanned aerial vehicle (UAV) remote sensing imagery is prone to errors such as misclassifications, omissions, and subpar edge detailing. This study introduces a multi-scale fusion and detail enhancement network for rural building extraction, termed the Multi-Attention-Detail U-shaped Network (MAD-UNet). Initially, an atrous convolutional pyramid pooling module is integ…
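The abstract is truncated on this page, so the exact design of MAD-UNet's atrous convolutional pyramid pooling module is not recoverable from it. As a point of reference only, here is a minimal PyTorch sketch of a generic atrous spatial pyramid pooling (ASPP) block; the dilation rates (1, 6, 12, 18), channel arguments, and class name are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

class ASPP(nn.Module):
    """Atrous spatial pyramid pooling: parallel dilated convolutions
    capture context at multiple scales, then a 1x1 convolution fuses them."""

    def __init__(self, in_ch: int, out_ch: int, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                # Rate 1 uses a plain 1x1 conv; larger rates use 3x3 dilated
                # convs with padding = rate, which preserves spatial size.
                nn.Conv2d(in_ch, out_ch,
                          kernel_size=3 if r > 1 else 1,
                          padding=r if r > 1 else 0,
                          dilation=r, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for r in rates
        ])
        self.fuse = nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Run every dilated branch on the same feature map, concatenate
        # along the channel axis, and fuse back to out_ch channels.
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))
```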

Cited by 2 publications (3 citation statements)
References: 42 publications
“…To verify the effectiveness of the proposed AFSF network, we conducted experiments on the new roof segmentation dataset and two common remote sensing semantic image segmentation datasets, i.e., the Inria Aerial Image Labeling (IAIL) [19] and WHU [20] datasets, and compared the results with several state-of-the-art methods, including Deeplab V3 [55], UNet [37], U2Net [44], HED [56], RCF [57], BASNet [58], MA [42], AS-UNet++ [59], and MAD-UNet [60].…”
Section: Comparison With the State of the Art
confidence: 99%
“…For example, on the IAIL dataset, the proposed AFSF network outperforms the second-best model, AS-UNet++ [59], by 0.7%, 0.7%, and 1.1% in terms of the precision, recall, and IoU, respectively; on the WHU dataset, the proposed AFSF network outperforms the second-best model, AS-UNet++ [59], by 0.6%, 0.5%, and 0.8% in terms of the precision, recall, and IoU, respectively.

Method          Precision (%)  Recall (%)  IoU (%)
BASNet [58]     95.9           92.9        89.3
HED [56]        94.2           90.4        85.6
RCF [57]        95.0           90.7        86.6
MA [42]         96.0           93.2        89.7
AS-UNet++ [59]  96.1           93.4        90.1
MAD-UNet [60]   95.9           93.0        89.6
AFSF            96.7           93.9        90.9

Figure 6. Some examples of prediction masks generated by the proposed AFSF and compared methods.…”
Section: Comparison With the State of the Art
confidence: 99%
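The precision, recall, and IoU figures compared above are standard pixel-wise segmentation metrics. A minimal NumPy sketch of how such metrics are commonly computed for binary building masks follows; the function name and the epsilon smoothing term are illustrative, not taken from the cited papers.

```python
import numpy as np

def segmentation_metrics(pred: np.ndarray, target: np.ndarray):
    """Pixel-wise precision, recall, and IoU for binary masks.

    pred, target: boolean arrays of identical shape (True = building pixel).
    """
    tp = np.logical_and(pred, target).sum()   # building pixels predicted correctly
    fp = np.logical_and(pred, ~target).sum()  # background predicted as building
    fn = np.logical_and(~pred, target).sum()  # building pixels that were missed
    eps = 1e-9  # avoids division by zero on empty masks
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    iou = tp / (tp + fp + fn + eps)
    return precision, recall, iou
```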
“…Constrained by the shortcomings of traditional algorithms, which are difficult to design and struggle to deliver real-time target segmentation with acceptable results against complex backgrounds, researchers have turned to deep learning-based image semantic segmentation methods to build models for the target segmentation task. Segmentation models built on the encoder–decoder structure of the fully convolutional network (FCN) [9] are now widespread; among them, the U-Net [10] model, owing to its relatively simple structure and outstanding segmentation performance, has, together with its variants, achieved remarkable results in semantic segmentation tasks in domains such as medicine [11], traffic [12], agriculture [13], aerial photography [14], and remote sensing [15]. O. Oktay et al. [16] proposed a novel Attention Gate (AG) model for the medical image domain, which can automatically learn to focus on target structures of different shapes and sizes, and integrated it into the U-Net architecture to build the Attention U-Net network, which reduces the computational overhead of the original U-Net model and improves the model’s sensitivity and accuracy.…”
Section: Related Work
confidence: 99%
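The quoted passage describes the additive Attention Gate of Attention U-Net only at a high level. A minimal PyTorch sketch of such a gate follows; it assumes the decoder's gating signal has already been resized to the skip features' spatial resolution, and all class, argument, and channel names are illustrative rather than taken from Oktay et al.

```python
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    """Additive attention gate in the spirit of Attention U-Net:
    the decoder's gating signal reweights the encoder skip features so the
    skip connection passes only regions relevant to the target structure."""

    def __init__(self, skip_ch: int, gate_ch: int, inter_ch: int):
        super().__init__()
        # Project skip features and gating signal into a shared space.
        self.w_x = nn.Conv2d(skip_ch, inter_ch, kernel_size=1, bias=False)
        self.w_g = nn.Conv2d(gate_ch, inter_ch, kernel_size=1, bias=False)
        # Collapse to a single attention coefficient per pixel in [0, 1].
        self.psi = nn.Sequential(
            nn.ReLU(inplace=True),
            nn.Conv2d(inter_ch, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor, g: torch.Tensor) -> torch.Tensor:
        # x: encoder skip features; g: gating signal at the same H x W
        # (in practice g is upsampled to match x before this call).
        alpha = self.psi(self.w_x(x) + self.w_g(g))
        return x * alpha  # (N, 1, H, W) weights broadcast over channels
```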