U-Net-Id, an Instance Segmentation Model for Building Extraction from Satellite Images—Case Study in the Joanópolis City, Brazil

Wagner, Fabien; Dalagnol, Ricardo; Tarabalka, Yuliya; Segantine, Tassiana Y. F.; Thomé, Rogério; Hirye, Mayumi C. M.

doi:10.3390/rs12101544

Cited by 42 publications

(19 citation statements)

References 25 publications

(36 reference statements)


“…Reinforcing the findings of Uhl et al [25] specifically, our study highlights the value of large, quality training datasets for training DL semantic segmentation algorithms to recognize features in topographic maps. Lastly, our study reinforces the documented strong performance of the UNet semantic segmentation method for extracting features and classifying pixels from a wide variety of data sources to support varying mapping tasks (for example, [54,55,62–67,69,70,91–93]). Such techniques, including future advancements and modifications, may eventually replace traditional ML methods, such as random forests (RF) and support vector machines (SVM), as operational standards in the field [46,52–55].…”

Section: Discussion (supporting)

confidence: 75%

“…UNet and other semantic segmentation methods have been applied to a variety of feature extraction and classification problems and have also been applied to a variety of geospatial and remotely sensed data. For example, modifications of UNet have been applied to the mapping of general land cover change [62], coastal wetlands [63], palm trees [64], cloud and cloud shadows [65], urban buildings and change detection [66][67][68], roads [69], and landslides [70]. Generally, UNet and other FCNs have shown great promise due to their ability to model complex spatial patterns and context while generating data abstractions that generalize well to new data [54,55].…”

Section: Deep Learning Semantic Segmentation (mentioning)

confidence: 99%


Semantic Segmentation Deep Learning for Extracting Surface Mine Extents from Historic Topographic Maps

et al. 2020


Historic topographic maps, which are georeferenced and made publicly available by the United States Geological Survey (USGS) and the National Map’s Historical Topographic Map Collection (HTMC), are a valuable source of historic land cover and land use (LCLU) information that could be used to expand the historic record when combined with data from moderate spatial resolution Earth observation missions. This is especially true for landscape disturbances that have a long and complex historic record, such as surface coal mining in the Appalachian region of the eastern United States. In this study, we investigate this specific mapping problem using modified UNet semantic segmentation deep learning (DL), which is based on convolutional neural networks (CNNs), and a large example dataset of historic surface mine disturbance extents from the USGS Geology, Geophysics, and Geochemistry Science Center (GGGSC). The primary objectives of this study are to (1) evaluate model generalization to new geographic extents and topographic maps and (2) to assess the impact of training sample size, or the number of manually interpreted topographic maps, on model performance. Using data from the state of Kentucky, our findings suggest that DL semantic segmentation can detect surface mine disturbance features from topographic maps with a high level of accuracy (Dice coefficient = 0.902) and relatively balanced omission and commission error rates (Precision = 0.891, Recall = 0.917). When the model is applied to new topographic maps in Ohio and Virginia to assess generalization, model performance decreases; however, performance is still strong (Ohio Dice coefficient = 0.837 and Virginia Dice coefficient = 0.763). Further, when reducing the number of topographic maps used to derive training image chips from 84 to 15, model performance was only slightly reduced, suggesting that models that generalize well to new data and geographic extents may not require a large training set. 
We suggest the incorporation of DL semantic segmentation methods into applied workflows to decrease manual digitizing labor requirements and call for additional research associated with applying semantic segmentation methods to alternative cartographic representations to supplement research focused on multispectral image analysis and classification.
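The abstract above reports a Dice coefficient, precision, and recall. As a minimal sketch of how these overlap metrics are commonly computed from a predicted and a reference binary mask, the following plain-Python function counts true positives, false positives, and false negatives pixel by pixel. The function name `overlap_metrics` and the nested-list mask representation are assumptions made for illustration; they are not taken from the cited study, whose exact evaluation procedure (for example, per-chip averaging) is not described here.

```python
def overlap_metrics(pred, truth):
    """Pixel-wise overlap metrics for two binary masks of equal shape.

    `pred` and `truth` are nested lists (rows of 0/1 values); this layout
    is an assumption for illustration, not the cited study's data format.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # predicted feature, reference feature
            elif p:
                fp += 1      # predicted feature, reference background (commission)
            elif t:
                fn += 1      # predicted background, reference feature (omission)

    def ratio(num, den):
        # Return 0.0 for empty denominators instead of raising.
        return num / den if den else 0.0

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
    }


if __name__ == "__main__":
    predicted = [[1, 1, 0],
                 [0, 1, 0]]
    reference = [[1, 0, 0],
                 [0, 1, 1]]
    print(overlap_metrics(predicted, reference))
```

When computed from a single set of pixel counts, the Dice coefficient equals the harmonic mean of precision and recall, which is why balanced omission and commission errors go together with a Dice value close to both.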


“…With the development of DCNNs in recent years, many algorithms have been proposed for processing remote sensing images [25][26][27][28][29][30][31][32]. The fully convolutional network [33] (FCN) replaces the fully connected layers with convolutional layers, making it possible for large-scale dense prediction.…”

Section: Related Work (mentioning)

confidence: 99%

Joint Learning of Contour and Structure for Boundary-Preserved Building Extraction

Liao, Han, et al. 2021, Remote Sensing


Most of the existing approaches to the extraction of buildings from high-resolution orthoimages consider the problem as semantic segmentation, which extracts a pixel-wise mask for buildings and trains end-to-end with manually labeled building maps. However, as buildings are highly structured, such a strategy suffers several problems, such as blurred boundaries and the adhesion to close objects. To alleviate the above problems, we proposed a new strategy that also considers the contours of the buildings. Both the contours and structures of the buildings are jointly learned in the same network. The contours are learnable because the boundary of the mask labels of buildings implicitly represents the contours of buildings. We utilized the building contour information embedded in the labels to optimize the representation of building boundaries, then combined the contour information with multi-scale semantic features to enhance the robustness to image spatial resolution. The experimental results showed that the proposed method achieved 91.64%, 81.34%, and 74.51% intersection over union (IoU) on the WHU, Aerial, and Massachusetts building datasets, and outperformed the state-of-the-art (SOTA) methods. It significantly improved the accuracy of building boundaries, especially for the edges of adjacent buildings. The code is made publicly available.
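The abstract above notes that the boundary of a building mask label implicitly represents the building contour. One simple way to make that concrete is sketched below, assuming a 4-connected neighbourhood: a mask pixel is marked as contour if any of its four neighbours is background or lies outside the image. The function name `mask_contour` and this neighbourhood rule are illustrative assumptions; the abstract does not state how the authors derive contour labels.

```python
def mask_contour(mask):
    """Return a binary contour map for a binary mask (nested lists of 0/1).

    A foreground pixel counts as contour when at least one of its four
    neighbours is background or falls outside the image. This rule is an
    illustrative assumption, not the cited paper's procedure.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    contour = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if not mask[r][c]:
                continue  # background pixels are never contour
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                outside = rr < 0 or rr >= height or cc < 0 or cc >= width
                if outside or not mask[rr][cc]:
                    contour[r][c] = 1
                    break
    return contour


if __name__ == "__main__":
    building = [[0, 0, 0, 0, 0],
                [0, 1, 1, 1, 0],
                [0, 1, 1, 1, 0],
                [0, 1, 1, 1, 0],
                [0, 0, 0, 0, 0]]
    for row in mask_contour(building):
        print(row)
```

A contour map derived this way could serve as a second training target alongside the mask itself, which is the general idea of learning contour and structure jointly.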


“…Promising building footprint detection approaches have been proposed in the literature. Wagner et al [40] presented a modified U-Net capable of discriminating between adjacent buildings. To incorporate the structure information of buildings, Hui et al [41] opted for a multi-task learning strategy, replacing the vanilla U-Net encoder with an Xception module.…”

Section: Semantic Segmentation Of Buildings and Roads (mentioning)

confidence: 99%

A Deep Learning Approach to an Enhanced Building Footprint and Road Detection in High-Resolution Satellite Imagery

et al. 2021


The detection of building footprints and road networks has many useful applications including the monitoring of urban development, real-time navigation, etc. Taking into account that a great deal of human attention is required by these remote sensing tasks, a lot of effort has been made to automate them. However, the vast majority of the approaches rely on very high-resolution satellite imagery (<2.5 m) whose costs are not yet affordable for maintaining up-to-date maps. Working with the limited spatial resolution provided by high-resolution satellite imagery such as Sentinel-1 and Sentinel-2 (10 m) makes it hard to detect buildings and roads, since these labels may coexist within the same pixel. This paper focuses on this problem and presents a novel methodology capable of detecting buildings and roads with sub-pixel width by increasing the resolution of the output masks. This methodology consists of fusing Sentinel-1 and Sentinel-2 data (at 10 m) together with OpenStreetMap to train deep learning models for building and road detection at 2.5 m. This becomes possible thanks to the usage of OpenStreetMap vector data, which can be rasterized to any desired resolution. Accordingly, a few simple yet effective modifications of the U-Net architecture are proposed to not only semantically segment the input image, but also to learn how to enhance the resolution of the output masks. As a result, generated mappings quadruple the input spatial resolution, closing the gap between satellite and aerial imagery for building and road detection. To properly evaluate the generalization capabilities of the proposed methodology, a dataset composed of 44 cities across the Spanish territory has been considered and divided into training and testing cities. Both quantitative and qualitative results show that high-resolution satellite imagery can be used for sub-pixel width building and road detection following the proper methodology.

