The Total Operating Characteristic from Stratified Random Sampling with an Application to Flood Mapping

Liu, Zhen; Pontius, Robert Gilmore

doi:10.3390/rs13193922

Cited by 15 publications

(6 citation statements)

References 22 publications


“…Through our results for the typical semantic segmentation networks with different structures, we verified the generalizability and effectiveness of the multiclass complexity-based optimal sampling method. Previous studies [44,53,54,80,81] have shown that the stratified sampling method can obtain training samples from different strata (regions), potentially improving the level of classification accuracy. However, the performance improvement in these studies depended on correctly stratifying (partitioning) the data, as there is no quantified standard indicator to measure the contribution of stratification to performance, and many have overlooked the significant contribution of each individual sample to the model's generalization capability for prediction.…”

Section: Discussion (mentioning)

confidence: 99%

Geocomplexity Statistical Indicator to Enhance Multiclass Semantic Segmentation of Remotely Sensed Data with Less Sampling Bias

He, Li, Gao

2024

Remote Sensing


Challenges in enhancing the multiclass segmentation of remotely sensed data include expensive and scarce labeled samples, complex geo-surface scenes, and resulting biases. The intricate nature of geographical surfaces, comprising varying elements and features, introduces significant complexity to the task of segmentation. The limited label data used to train segmentation models may exhibit biases due to imbalances or the inadequate representation of certain surface types or features. For applications like land use/cover monitoring, the assumption of evenly distributed simple random sampling may not be satisfied due to spatial stratified heterogeneity, introducing biases that can adversely impact the model’s ability to generalize effectively across diverse geographical areas. We introduced two statistical indicators to encode the complexity of geo-features under multiclass scenes and designed a corresponding optimal sampling scheme to select representative samples to reduce sampling bias during machine learning model training, especially that of deep learning models. The results of the complexity scores showed that the entropy-based and gray-based indicators effectively detected the complexity from geo-surface scenes: the entropy-based indicator was sensitive to the boundaries of different classes and the contours of geographical objects, while the Moran’s I indicator had a better performance in identifying the spatial structure information of geographical objects in remote sensing images. According to the complexity scores, the optimal sampling methods appropriately adapted the distribution of the training samples to the geo-context and enhanced their representativeness relative to the population. The single-score optimal sampling method achieved the highest improvement in DeepLab-V3 (increasing pixel accuracy by 0.3% and MIoU by 5.5%), and the multi-score optimal sampling method achieved the highest improvement in SegFormer (increasing ACC by 0.2% and MIoU by 2.4%). These findings carry significant implications for quantifying the complexity of geo-surface scenes and hence can enhance the semantic segmentation of high-resolution remote sensing images with less sampling bias.


“…It would be worth investigating and implementing a variety of statistical machine learning or deep learning approaches in the lulcc package in the future, such as the mixed effects model, ensemble learning or a variant of neural networks with several tuning hyperparameters to obtain the optimal performance. The final step is to validate the model using the Total Operating Characteristic (TOC) created by Liu et al (2021) to substitute the popular ROC that claims to offer more information and a distinct interpretation.…”

Section: Discussion (mentioning)

confidence: 99%

Land Use Change Modelling Using Logistic Regression, Random Forest and Additive Logistic Regression in Kubu Raya Regency, West Kalimantan

Pradana, Djuraidah, Soleh

2023

For. Geo.


Kubu Raya Regency is a regency in the province of West Kalimantan which has a wetland ecosystem including a high-density swamp or peatland ecosystem along with an extensive area of mangroves. The function of wetland ecosystems is essential for fauna, as a source of livelihood for the surrounding community and as storage reservoir for carbon stocks. Most of the land in Kubu Raya Regency is peatland. As a consequence, peat has long been used for agriculture and as a source of livelihood for the community. Along with the vast area of peat, the regency also has a potential high risk of peat fires. This study aims to predict land use changes in Kubu Raya Regency using three statistical machine learning models, specifically Logistic Regression (LR), Random Forest (RF) and Additive Logistic Regression (ALR). Land cover map data were acquired from the Ministry of Environment and Forestry and subsequently reclassified into six types of land cover at a resolution of 100 m. The land cover data were employed to classify land use or land cover class for the Kubu Raya regency, for the years 2009, 2015 and 2020. Based on model performance, RF provides greater accuracy and F1 score as opposed to LR and ALR. The outcome of this study is expected to provide knowledge and recommendations that may aid in developing future sustainable development planning and management for Kubu Raya Regency.


“…Compared with ROC, TOC can provide more useful information in adopting the same data and graphics space [59, 60]. TOC software is available from (accessed on 30 October 2021) [61]. Making the TOC for each category is based on the expensed Boolean map and gain probability map.…”

Section: Methods (mentioning)

confidence: 99%

“…TOC software is available from https://lazygis.github.io/projects/TOCCurveGenerator (accessed on 30 October 2021) [61]. Making the TOC for each category is based on the expensed Boolean map and gain probability map.…”

Section: Validation (mentioning)

confidence: 99%

Intensity Characteristics and Multi-Scenario Projection of Land Use and Land Cover Change in Hengyang, China

Deng, Quan

2022

IJERPH


Intensity analysis has generally been applied as a top-bottom hierarchical accounting method to understand regional dynamic characteristics of land use and land cover (LULC) change. Given the inconvenience of transition level in the detailed and overall presentation of various category transitions at multiple intervals, a novel transition pattern is proposed to represent the transition’s size and intensity and to intuitively identify the stationary mode of transition, which helps the transition level to connect to the mode with the process. An intensity analysis was conducted to communicate the transition between LULC categories in Hengyang from 1980 to 2015. The patch-generating land use simulation (PLUS) model was employed for multi-scenario projection from 2015 to 2045. From 1980 to 2015, 2005 was a significant turning point in the speed of LULC change in Hengyang, and the change rate after this time point was three times that before the time point. The gain of built-up and bare, and the loss of cultivated was always active. The reason for the large loss of forest is that forest comprises the largest proportion of Hengyang. The loss of cultivated and the loss of forest contributing to the built-up’s gain is much larger, but the mechanism behind the transition differed. A stationary targeting transition mode from cultivated to built-up in Hengyang was detected. The PLUS model confirmed that the area of forest, cultivated and grass will reduce, and the rate of decrease will slow down in the future, while water areas will slightly increase. Our work enriches the methodology of intensity analysis and provides a scientific reference for the sustainable development and management of land resources in Hengyang.
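The citation statements above describe building a TOC for a category from a Boolean map and a probability map. As a rough illustration of that idea, the sketch below computes TOC points from a Boolean reference and a rank index, assuming simple (unstratified) data with equal weight per observation. The function name `toc_points` and the toy arrays are hypothetical and are not taken from the cited papers or from the TOC Curve Generator software.

```python
import numpy as np

def toc_points(reference, index):
    """Return TOC coordinates for thresholds swept from high to low index.

    x = hits + false alarms (observations diagnosed as presence)
    y = hits (diagnosed presence that is also reference presence)
    Assumption: every observation carries equal weight (no strata).
    """
    reference = np.asarray(reference, dtype=bool)
    index = np.asarray(index, dtype=float)

    # Sort observations from the highest index value to the lowest.
    order = np.argsort(-index, kind="stable")
    sorted_ref = reference[order]
    sorted_idx = index[order]

    # Running count of reference presence among the diagnosed observations.
    cum_hits = np.cumsum(sorted_ref)

    # Tied index values share one threshold: keep the last position of each tie.
    last_of_tie = np.concatenate((sorted_idx[1:] != sorted_idx[:-1], [True]))
    positions = np.arange(1, reference.size + 1)

    x = np.concatenate(([0], positions[last_of_tie]))
    y = np.concatenate(([0], cum_hits[last_of_tie]))
    return x, y

# Toy data (hypothetical): 8 observations, 4 of them reference presence.
reference = [1, 0, 1, 1, 0, 0, 1, 0]
index = [0.9, 0.8, 0.7, 0.7, 0.4, 0.3, 0.2, 0.1]

x, y = toc_points(reference, index)
presence = int(np.sum(reference))   # size of reference presence
extent = len(reference)             # size of the whole extent

# Each TOC point yields all four confusion-matrix entries.
hits = y
false_alarms = x - y
misses = presence - y
correct_rejections = extent - presence - false_alarms
```

Because each point reports hits and hits plus false alarms in absolute units, misses and correct rejections follow from the totals, which is the extra information the statements attribute to TOC relative to ROC. Every point lies inside the parallelogram bounded by `max(0, x - (extent - presence)) <= y <= min(x, presence)`.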
