Machine learning algorithms such as Random Forest (RF) are increasingly applied to traditionally geographical topics such as population estimation. Even though RF is a well-performing and generalizable algorithm, the vast majority of its implementations are still 'aspatial' and may not address spatially heterogeneous processes. At the same time, remote sensing (RS) data, which are commonly used to model population, can be highly spatially heterogeneous. Against this background, we present a novel geographical implementation of RF, named Geographical Random Forest (GRF), as both a predictive and an exploratory tool to model population as a function of RS covariates. GRF is a disaggregation of RF into geographical space in the form of local sub-models. From the first empirical results, we conclude that GRF can be more predictive when an appropriate spatial scale is selected to model the data, with reduced residual autocorrelation and lower Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) values. Finally, and of equal importance, GRF can be used as an effective exploratory tool to visualize the relationship between dependent and independent variables, highlighting interesting local variations and allowing for a better understanding of the processes that may be causing the observed spatial heterogeneity.
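The three diagnostics named in the abstract above (RMSE, MAE and residual autocorrelation) can be illustrated with a minimal sketch. The abstract does not say which autocorrelation statistic was used, so global Moran's I is an assumption here. The toy observations, predictions and binary adjacency weights are hypothetical and are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root Mean Squared Error between two equally long sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean Absolute Error between two equally long sequences."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def morans_i(values, weights):
    """Global Moran's I (assumed statistic) for a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / w_sum) * (cross / sum(d * d for d in dev))


# Hypothetical toy data: four areal units in a row, each adjacent to its neighbours.
observed = [10.0, 12.0, 20.0, 22.0]
predicted = [11.0, 13.0, 19.0, 21.0]
residuals = [o - p for o, p in zip(observed, predicted)]
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

print(rmse(observed, predicted), mae(observed, predicted))
print(morans_i(residuals, weights))
```

A Moran's I close to zero on the residuals would suggest that the model has absorbed most of the spatial structure. Clearly positive values indicate that neighbouring units still share similar errors.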
Since 1999, very high spatial resolution satellite data have represented the surface of the Earth in more detail. However, information extraction by per-pixel multispectral classification techniques proves to be very complex owing to the increased internal variability of land-cover units and to the weak spectral resolution. Image segmentation before classification was proposed as an alternative approach, but a large variety of segmentation algorithms have been developed during the last 20 years, and a comparison of their implementation on very high spatial resolution images is necessary. In this study, four algorithms from the two main groups of segmentation algorithms (boundary-based and region-based) were evaluated and compared. Each algorithm was evaluated with empirical discrepancy evaluation methods, using a visual segmentation of Ikonos panchromatic images as the reference. The results show that the choice of parameters is very important and has a great influence on the segmentation results. The selected boundary-based algorithms are sensitive to noise and texture. Better results are obtained with region-based algorithms, but problems can arise in the transition zones between contrasted objects.
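An empirical discrepancy evaluation compares a machine segmentation with a reference segmentation. The abstract does not name the specific measures used, so the sketch below assumes one simple area-based measure: the mean of 1 minus the Jaccard overlap between each reference region and its best-matching segment. The function name `region_discrepancy` and the toy label grids are hypothetical.

```python
from collections import Counter


def region_discrepancy(reference, segmented):
    """Mean (1 - Jaccard) between each reference region and its best-overlapping segment.

    Both inputs are equally sized 2-D lists of integer region labels.
    A value of 0.0 means every reference region is reproduced exactly.
    """
    ref_size = Counter()
    seg_size = Counter()
    overlap = Counter()
    for ref_row, seg_row in zip(reference, segmented):
        for r, s in zip(ref_row, seg_row):
            ref_size[r] += 1
            seg_size[s] += 1
            overlap[(r, s)] += 1

    scores = []
    for r, r_area in ref_size.items():
        best = 0.0
        for (ref_label, s), inter in overlap.items():
            if ref_label == r:
                union = r_area + seg_size[s] - inter
                best = max(best, inter / union)
        scores.append(1.0 - best)
    return sum(scores) / len(scores)


# Hypothetical toy grids: the reference has two regions, and the machine
# segmentation over-splits the right-hand region into two segments.
reference = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
]
segmented = [
    [1, 1, 2, 3],
    [1, 1, 2, 3],
]

print(region_discrepancy(reference, reference))
print(region_discrepancy(reference, segmented))
```

Under this assumed measure, over-segmentation and under-segmentation both raise the score, which makes it a convenient single number for comparing parameter settings.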
In this letter, the recently developed Extreme Gradient Boosting (Xgboost) classifier is implemented in a very-high-resolution (VHR) object-based urban Land Use-Land Cover application. In detail, we investigated the sensitivity of Xgboost to various sample sizes, as well as to feature selection (FS) by applying a standard technique, Correlation-Based Feature Selection. We compared Xgboost with benchmark classifiers such as Random Forest (RF) and Support Vector Machines (SVM). The methods are applied to VHR imagery of two Sub-Saharan cities, Dakar and Ouagadougou, and of the village of Vaihingen, Germany. The results demonstrate that Xgboost, parametrized with a Bayesian procedure, systematically outperformed RF and SVM, mainly in larger sample sizes.
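Correlation-Based Feature Selection scores a feature subset by rewarding correlation with the target and penalising correlation among the features themselves. The sketch below is a simplified illustration and not the letter's implementation. It assumes absolute Pearson correlation as the association measure and a greedy forward search. The feature names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def cfs_merit(features, target, subset):
    """CFS merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff)."""
    k = len(subset)
    r_cf = sum(abs(pearson(features[f], target)) for f in subset) / k
    if k == 1:
        return r_cf
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def cfs_forward(features, target):
    """Greedy forward search: add the best feature while the merit improves."""
    selected, best = [], 0.0
    remaining = sorted(features)
    while remaining:
        merit, name = max(
            (cfs_merit(features, target, selected + [f]), f) for f in remaining
        )
        if merit <= best:
            break
        selected.append(name)
        remaining.remove(name)
        best = merit
    return selected, best


# Hypothetical toy data: one informative feature, one largely redundant
# with it, and one unrelated to the binary class label.
target = [0, 0, 0, 1, 1, 1]
features = {
    "f_signal": [0.1, 0.2, 0.3, 0.7, 0.8, 0.9],
    "f_redundant": [0.2, 0.1, 0.4, 0.6, 0.9, 0.7],
    "f_noise": [0.5, 0.1, 0.5, 0.1, 0.5, 0.5],
}

selected, best = cfs_forward(features, target)
print(selected, round(best, 3))
```

In this toy case the redundant feature is not added: although it correlates with the label, its strong correlation with the already selected feature lowers the subset merit.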